ABSTRACT



As  a  part  of  the  operational  decision  aid  program  of  the  Office  of 
Naval  Research,  one  of  the  decision  aids  (a  strike  timing  aid)  developed  to 
date  was  tested  to  evaluate  comparatively  the  merit,  if  any,  of  the  full  aid 
and  its  various  components.  The  aid  was  developed  to  be  representative  of 
a  class  of  operational  decision  aids  which  provide  trend  output  information. 

Flight "experienced" and flight "inexperienced" subjects solved "hard"
and "easy" strike launch time problems using the full aid, selected portions
of the aid, and without the aid. The results were analyzed by a variety of
methods and supported contentions favoring the value of the aid. The results
suggested: (1) an increase in decision validity by a factor of five when
unaided decisions were compared with aided decisions, (2) a quite strong
achievement by the aid of its goals, and (3) differential effectiveness as a
function of problem difficulty and the experience of the user.

Implications for aid development and evaluations of such aids are
also  presented. 


SUMMARY 


As  a  part  of  the  operational  decision  aid  program  of  the  Office  of 
Naval  Research,  one  of  the  decision  aids  developed  to  date,  the  strike  timing 
aid,  was  subjected  to  a  test  to  evaluate  comparatively  the  merit,  if  any,  of 
the  full  aid  and  its  various  components.  This  evaluation  is  one  of  a  set  of 
evaluations of various decision aids developed under the decision aid
program of the Office of Naval Research.


Description  of  the  Strike  Timing  Aid 

The strike timing decision aid was developed by Analytics, Inc. to
be representative of a class of operational decision aids which provide trend
output. The version tested was not considered to be in a "ready for use"
state. Essentially, the aid is based on a mathematical engagement model
which predicts the outcome of an air strike as a function of strike launch
time. On the basis of input information, the aid provides two types of user
oriented information: (1) projected strike outcome information, and (2)
expected strike utility. The projected outcome information consists of
such items as projected own losses, projected enemy air losses, and
projected enemy ground losses. The expected utility information presents the
"value" of a strike as a function of strike launch time. This value is
calculated as a function of "subjective" values assigned by the aid's user to
the loss or destruction of various types of units and the number of units
(both own and enemy) expected to be destroyed at various launch times.

Other user oriented features are also provided by the aid, e.g.,
an analysis of losses by mission segment and a sensitivity analytic feature.


Method 


Flight "experienced" and flight "inexperienced" groups were asked
to solve "hard" and "easy" strike launch problems using the full aid,
selected portions of the aid, and no aid. The results were analyzed relative
to five hypotheses concerning the utility of the aid. The hypotheses
concerned: (1) the effectiveness of strike launch time decisions made with
the use of the aid and those made without the use of the aid, (2) the
perceived usefulness of the aid, (3) the effectiveness and perceived usefulness
as a function of the experience of the user and as a function of problem
difficulty, (4) the validity of the aid, and (5) the effectiveness of decisions
made when only portions of the aid are made available for use.




Data pertinent to each of these five hypotheses were collected and
compared with criterion data reflecting the optimum solution to each
problem. Each participant in the study was also interviewed concerning his
reaction to the various features of the aid.

Findings 

The results supported contentions favoring the value of the aid.
There were consistent, statistically significant differences, favoring
aiding, between the unaided condition and some level of aiding. The data
suggested an increase in decision validity by a factor of five when unaided
decisions were compared with aided decisions. The results of a
multiattribute utility analysis indicated that the aid achieved its goals
quite well.

A set of regression analyses indicated that the aid users did not
employ all of the information provided by the aid when they attempted to solve
a problem. More typically, the user selected one or two aspects which
were important to him (e.g., weather at target, enemy air defense
readiness, projected own losses) and based his final strike launch time choice
on that consideration or those considerations.


There  was  some  evidence  of  a  differential  effectiveness  of  the  aid 
as  a  function  of  problem  difficulty  and  experience  of  the  user. 

The interview information also provided support for contentions
favoring the value of the aid. While some reservations were expressed about
the form of certain output displays, most experienced participants indicated
that they would use such an aid in an actual operational situation, at least
as a supplement to other information on hand.

I.  INTRODUCTION


Since 1974, the Office of Naval Research (ONR) has been
investigating the feasibility of producing tactical level, computer based
decision aids for application in various operational situations. From the
outset, three important factors governed the program's development. First,
the aids were intended to meet the needs of task force level decision makers
and planners--task force commanders and their staffs. Second, new
methodologies were emphasized which could handle the complexity inherent in
naval command and control and in tactical planning. Third, the aids were
to be objectively tested and evaluated.

Objective, experimental testing and evaluation were emphasized
on at least three levels (Sinaiko, 1977). During the early stages, each
aid was to be tested by the persons responsible for the aid's design and
development in their own facilities. When an aid was considered
sufficiently ready, it was to be evaluated by an independent agency. Finally,
the appropriate naval user organization will test any aid being considered
for use in the Fleet.


Organization  of  Decision  Aid  Program 

The essential structure of ONR's operational decision aid program
rests on the activities of a variety of university and industrial research
and development organizations. The total program is monitored by ONR
through a steering committee. The various participating organizations are
primarily involved in the development of the aids, although some have also
addressed specific problems that bear on the general nature and
effectiveness of such aids (e.g., Lucas and Ruff, 1977; Analytics, Inc., 1976;
Brown, 1978).


The Department of Decision Sciences, Wharton School,
University of Pennsylvania supplies computer and data management support
systems as well as facilities and apparatus for demonstrating and testing
the aids.

Applied Psychological Services is the organization selected to
evaluate the aids independently. The role of Applied Psychological Services
is to serve the function of a crucible--to test critically, rigorously, and
fairly each of the decision aids and to report findings and recommendations
for improvement of the various aids. Such evaluations are to be conducted
within the context of the ONRODA scenario (Payne and Rowney, 1975;
Rowney, 1975), and with naval personnel serving as test subjects.


Aid  Evaluation 


Because it seemed important to place the present evaluation program
into the context of a total decision aid developmental framework, a conception
(Figure 1) of the steps to be followed during the development of such aids
was developed (Siegel and Madden, 1 070). The figure is read from the
bottom to the top, with the considerations involved in each stage entering
from the left of each box and the results of each stage exiting to the right.
The number(s) above each Figure 1 box represent criteria which may be applied
after each developmental stage. These criteria are defined in Table 1. The
rounded boxes associated with each rectangular stage box represent
descriptors which may be applied as the criteria at the successive stages are
met. Accordingly, an aid may be successively called "suitable," "testable,"
"reasonable," "valid," "effective," and "useful." Note that we are primarily
concerned within the present aid evaluation program with the upper right box--
"validation testing-exercise in lab experiment and compare with intermediate
criterion."


Within  this  context,  the  work  possesses  a  number  of  characteristics: 

•  use of well-controlled, precise, multivariate methods

•  programmatic approach

•  full coordination with ONR

•  orientation towards possible conditions of actual aid use in the Navy

•  coordination with aid developers but maintenance of evaluation integrity

•  use of previously developed ONRODA action scenarios where possible


Purpose  of  Present  Work 

The  global  purpose  of  the  aid  evaluations  is  to  answer  for  each  aid 
such  questions  as: 

•  Does  it  work? 

•  Why  does  it  work? 

•  How can it be made to work better?

It is important to know not only that an aid does or does not possess
utility but also which of its characteristics contribute to the utility. For
example, an aid might be useful because it synthesizes information from a

Table 1

Criteria for Evaluating the Utility of a Decision Aid

Criterion                           Definition

1.  Internal consistency            Extent to which the constructs of the aid
                                    are marked by coherence and similarity of
                                    treatment

2.  Indifference to trivial         Potential of the aid to avoid major changes
    aggregation                     in output when input groupings or conditions
                                    undergo insignificant fluctuations

3.  Correct prediction in the       Extent of agreement (correctness of
    extreme (predictive or          predictions) between the aid and actual
    empirical validity)             performance at very high/low values of
                                    conditions

4.  Correct prediction in mid       Like above for middle range values of
    range (predictive or empirical  conditions
    validity)

5.  Construct validity              Theoretic adequacy of the aid's constructs

6.  Content (variable parameter)    Extent to which the aid's variables/parameters
    validity (Fidelity)             match real life conditions

7.  Realism or "face validity"      Extent to which selected content matches
                                    each attribute included

8.  Richness of output              Number and type of output variables and
                                    forms of presentation

9.  Ease of use                     Extent to which an analyst can readily
                                    prepare data for, apply, and extract
                                    understandable results from the aid

10. Cost of development             Value of effort to conceive, develop, test,
                                    document, and support

11. Transportability-generality     Extent of applicability to different systems,
                                    missions, and configurations

12. Cost of use                     Value of all effort involving use of aid,
                                    including data gathering, input, data
                                    processing, and analysis of results

13. Internal validity               Extent to which outputs are repeatable when
                                    inputs are unchanged

14. Event or time series validity   Extent to which aid predicts event and event
                                    patterns

diversity of areas to give a planner new insights; or, an aid might
significantly reduce the amount of time and labor required to make a decision;
or, it might allow for a more careful analysis of a broad range of
alternatives. Therefore, in attempting to examine the usefulness of any aid,
it is necessary to specify the factors out of which its utility may have been
derived and how they may interact.

Linked with usefulness is a concern for the goals of an aid, their
relative importance, and how closely they were achieved. The goals are
objective expressions of what the aid should be, or do, or facilitate.
Therefore, an examination of how clearly the goals of an aid were achieved and
their relative importance should lead to a better understanding of what
contributes to the usefulness of an aid and possibly to how to improve it.


Other  Evaluative  Considerations 


Other aspects of a decision aid which must be considered in any
thorough evaluation are of a less general nature than those already
discussed, but they make important contributions to the assessment of an
aid. These aspects relate very strongly to human factors considerations
and fall into three general groups:

(a)  Nature of information

     (1)  Is sufficient information provided by the aid?

     (2)  Is the information provided pertinent? Accurate?
          Timely? In the required form?

(b)  Method of output presentation

     (1)  Is the information presented in a manner which
          is responsive to user requirements?

     (2)  Are the tables/graphs and other output formats
          easily comprehensible?

     (3)  Is an optimal amount of information presented
          in each table/graph?

(c)  User ease

     (1)  Is the aid relatively easy to use?

     (2)  Are the user commands to the computer system
          arranged so as to minimize both effort and
          confusion?

     (3)  Are the user oriented error messages
          understandable and informative?

The Strike Timing Decision Aid

This report presents the methods, procedures, and results of the
first in a series of decision aid evaluative studies. The report is concerned
with an evaluation of Analytics' Strike Timing Decision Aid (ASTDA). The
ASTDA was designed as a tactical decision aiding system to be used by task
force level flight operations officers. Essentially, the ASTDA is based on
a mathematical engagement model which predicts the outcome of Blue (own)
air strikes launched against Orange (enemy) forces. The aid was developed
by Analytics, Inc., within the framework of the general ONRODA scenario
(Payne and Rowney, 1975).

ASTDA  Characteristics 

ASTDA was developed as a prototype of a class of aids which might
support any decision regarding when to take an action when the action
itself has already been determined. In its specific implementation, ASTDA
is intended to supply flight operations officers with information concerning:
(1) likely combat conditions (e.g., weather at target, number and type of
Orange forces, etc.) at various future points in time, (2) probable outcomes
(e.g., number and kind of Blue forces lost, etc.), and (3) expected utility
(the relative value of an air strike to Blue) of air strikes launched at
various future points in time.

Several  other  sets  of  information  are  also  made  available  by  the 
ASTDA.  These  allow  the  operations  officer  to  examine  more  finely  other 
aspects  of  the  outcomes  of  a  projected  air  strike.  One  of  these  is  losses 
by mission segment. This information indicates the Blue losses from a
specific  air  strike  as  a  function  of  air  mission  segment  (take-off,  ingress, 
at-target,  egress,  and  landing).  The  other  is  a  sensitivity  analysis  which 
allows  an  examination  of  the  sensitivity  of  the  expected  utility  to  changes 
in  Blue  or  Orange  forces. 

All the information available from ASTDA can be presented in both
the tabular and the graphic forms. The information displays, except those
pertaining to weather conditions which are presented as probabilities, are
presented as means and delta biased uncertainty bands. A delta biased
uncertainty band is a concept, developed by Analytics, which shows the two
standard deviation interval around the mean "whose midpoint has been moved
away from the . . . 'mean' of the distribution by an amount given by the
parameter delta" (Glenn, 1978). The value of delta used is selected with
the purpose of correcting for skewness. Accordingly, a delta biased
uncertainty band never includes values that are not actually within the range
of the distribution.


ASTDA Information Base

The ASTDA processes relevant information to calculate air strike
result predictions for various strike launch times. This information would
normally be supplied by weather officers, readiness officers, intelligence
officers, etc.

The ASTDA requires five categories of input (data base) information;
the first three concern the strength and resources of the Blue and the Orange
forces, while the last two concern weather conditions. The information for
the first three categories is entered as means and standard deviations.
ASTDA converts the standard deviations into delta biased uncertainty bands.
The information concerning the weather is entered in terms of the
probability of good visibility. Once the information is entered, it can be
called and displayed in either the tabular or the graphic forms. The displays
which are then available are the Blue Force Availability (BFA), the Orange Air
Defenses (ORAD), the Orange Ground Defenses (ORGD), the Weather at the
Target (WAT), and the Weather at the Carrier (WAC) as a function of time.
For the purposes of the present evaluation, the input information was
supplied to the subjects and was preinserted into the system.

In the present work, the BFA displays indicated the relevant
information for one type of Blue fighter-interceptor (the BF1s), and for two
types of attack-bombers (the BB1s and the BB2s). The BFA displays also
included information on the desired number of Blue aircraft (DNB). The ORAD
displays contained the information concerning two types of Orange
fighter-interceptor (the OF1s and the OF2s). The ORGD displays included
information on two types of ground defense: OD1 (surface-to-air missiles) and
OD2 (anti-aircraft artillery), as well as the number of OT1s (passive ground
targets). The WAT and WAC displays indicated the probability of good
visibility at the target and at the carrier, respectively. The probability of
good visibility at the target was shown graphically as a function of time at
the target while that for visibility at the carrier was plotted as a function
of the landing time.

ASTDA  Output 

ASTDA produces two primary types of user oriented information:
(1) projected outcome information, and (2) expected strike utility. As
stated previously, this information is presented as a function of various
strike launch times and is made available in both the graphic and the
tabular formats.


a.  Outcome. The information entered into ASTDA is used in
an engagement model which predicts the results of air
strikes launched at the various strike launch times. The
output of the engagement model is essentially a statement
of the probable number and kind of lost or destroyed Blue
and Orange forces. The strike outcomes in the evaluation
were available in three displays: Blue Force Losses (BFL),
Orange Air Losses (ORAL), and Orange Ground Losses (ORGL).
The likely number of lost BF1s (fighter-interceptors), BB1s
and BB2s (attack-bombers) across each prospective strike
launch time was presented in the BFL displays. The ORAL
displays indicated the number of OF1s and OF2s
(fighter-interceptors) likely to be destroyed at each prospective strike
launch time. The ORGL displays contained the number of
OD1 (surface-to-air missiles), OD2 (anti-aircraft artillery),
and OT1 (passive ground targets) which would probably be
destroyed at each of the strike launch times. The information
contained in each of these displays was given as both means
and delta biased uncertainty bands.


b.  Expected Utility. In normal ASTDA employment, to
determine the expected utilities of strikes launched at various
times, the operations officer would first assign "subjective"
values to the loss and destruction of each unit type. The
value assigned to each unit would reflect the user's
judgments of importance of the unit to the overall mission.
In the present evaluation, these judgments were assigned
independently of the subjects.

ASTDA computes the expected utility (EU) by taking the
number of units of each type (Blue and Orange) predicted by the
engagement model to be lost or destroyed, multiplying each
by its assigned value, and summing across the units. The
resulting sum is the expected utility, and the higher the
expected utility the better the relative outcome of the air strike
for Blue. The aid computes an expected utility independently
for each anticipated strike launch time. The calculated
utilities are presented as a display in which utility is plotted as
a function of the strike launch time. The strike launch time
with the highest expected utility would therefore be considered
the best time to launch an air strike.
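
The expected utility calculation just described can be sketched in a few
lines. The sketch below is illustrative only: the unit values, the predicted
losses, the launch time labels, and the sign convention (negative values for
Blue losses, positive values for Orange units destroyed) are all assumptions
made for the example and are not taken from the ASTDA itself.

```python
# Sketch of the expected utility (EU) calculation described above.
# All numbers are hypothetical; the sign convention is an assumption.

# Assumed "subjective" value per unit: negative for Blue (own) losses,
# positive for Orange (enemy) units destroyed.
UNIT_VALUES = {
    "BF1": -10.0, "BB1": -8.0, "BB2": -8.0,   # Blue aircraft
    "OF1": 6.0, "OF2": 6.0,                   # Orange fighter-interceptors
    "OD1": 4.0, "OD2": 2.0, "OT1": 12.0,      # Orange ground defenses, targets
}

# Assumed engagement model output: mean number of units lost or destroyed
# for each candidate strike launch time.
PREDICTED_LOSSES = {
    "0600": {"BF1": 2, "BB1": 3, "BB2": 1, "OF1": 4, "OF2": 2,
             "OD1": 1, "OD2": 3, "OT1": 5},
    "0900": {"BF1": 1, "BB1": 2, "BB2": 1, "OF1": 3, "OF2": 2,
             "OD1": 2, "OD2": 2, "OT1": 6},
    "1200": {"BF1": 3, "BB1": 4, "BB2": 2, "OF1": 5, "OF2": 3,
             "OD1": 2, "OD2": 4, "OT1": 6},
}


def expected_utility(losses, values):
    """Multiply each predicted loss by its assigned value and sum across units."""
    return sum(values[unit] * count for unit, count in losses.items())


def rank_launch_times(predictions, values):
    """Return (launch_time, EU) pairs ordered from highest to lowest EU."""
    utilities = {t: expected_utility(losses, values)
                 for t, losses in predictions.items()}
    return sorted(utilities.items(), key=lambda item: item[1], reverse=True)


ranking = rank_launch_times(PREDICTED_LOSSES, UNIT_VALUES)
best_time, best_eu = ranking[0]
print(ranking)
```

With these invented inputs the 0900 launch has the highest expected utility
and would be treated as the best launch time; a different set of subjective
values could reorder the launch times.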

Other ASTDA features, of a less direct nature, were not included
in the present evaluation. These included the analysis of the losses by
mission segment and the sensitivity analytic features.


Specific  Purposes  of  Present  Study 

The  evaluation  plan  employed  for  the  ASTDA  evaluation  represents 
a  synthesis  of  an  initial  plan  developed  by  Applied  Psychological  Services 
and one proposed by Analytics, Inc. To reach a consensus, a series of
meetings was held. Representatives of Applied Psychological Services,
Analytics,  Inc.,  the  Department  of  Decision  Sciences  of  the  Wharton 
School  at  the  University  of  Pennsylvania,  and  the  ONR  participated  in 
these  meetings.  A  draft  of  the  consensus  was  prepared  (Siegel,  1978) 
and  submitted  to  the  other  cooperating  agencies.  The  final  plan  emerged 
as  the  result  of  comments  by  the  other  organizations  on  the  draft. 

The evaluative research plan was developed to test five hypotheses
concerning the effectiveness of the strike timing decision aid under
laboratory test. The five hypotheses were:

Hypothesis 1.  More effective strike timing decisions can be
               made using the ASTDA than without the aid.

Hypothesis 2.  Users will perceive the aid to possess value.

Hypothesis 3.  Effectiveness and perceived value will not vary
               as a function of the user's operational experience
               level or problem difficulty.

Hypothesis 4.  The strike timing decision aid possesses criterion
               related validity.

Hypothesis 5.  Decision effectiveness will vary systematically as
               the characteristics of the aid are varied.

II.  METHODS AND PROCEDURES

A five by two by two factorial design formed the basis for the present
evaluation. The three independent variables were aid level, test problem
difficulty ("easy" or "hard"), and Navy operational experience of the
subjects ("minimum" or "considerable").

Each  subject  was  presented  with  a  series  of  scenario  problems  for 
which  he  was  required  to  rank  order  launch  times  for  an  air  strike  from 
a  carrier  against  the  ONRODA  island. 

The main dependent variable was the subjects' launch time rankings.
This choice behavior was compared with criterion data to examine its
consistency with expert Naval opinion and with the best alternative
indicated by the utility values generated by the aid. The details are
described below.

Factors 

The first evaluation design factor was concerned with the levels of
the decision aid. Five levels were investigated in an attempt to determine
the effects of parts of the aid (aid features) in isolation as compared with
the aid as a whole and with a no-aid (control) condition. The first level
was the full-aid condition in which all five input displays (BFA, ORAD, ORGD,
WAT, and WAC) and all four output displays (BFL, ORAL, ORGL, and EU) were
made available to the subject. The second level was a utility condition in
which the five input displays (BFA, ORAD, ORGD, WAT, and WAC) and
only one output display (EU) were made available. The third level was the
outcome condition in which the five input displays (BFA, ORAD, ORGD, WAT, and
WAC) and three outcome displays (BFL, ORAL, and ORGL) but not EU were
available. The fourth was a no uncertainty bands condition in which the five
input (BFA, ORAD, ORGD, WAT, and WAC) and four output (BFL, ORAL,
ORGL, and EU) displays were presented but only as means, i.e., the delta
biased uncertainty bands were deleted from the displays. The final level
was an unaided control condition in which the subjects received only the
input information (BFA, ORAD, ORGD, WAT, and WAC). This information
was made available only in the tabular form for the fifth condition to
simulate what might currently be available in the fleet.
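
The five aid levels differ only in which displays are shown and whether the
uncertainty bands appear, so they can be summarized as a small lookup table.
The sketch below is an illustration, not part of the evaluation software: the
level names, the dictionary layout, and the helper function are hypothetical,
and the input displays use the abbreviations introduced earlier in the report
(BFA, ORAD, ORGD, WAT, WAC). Showing bands in the utility and outcome
conditions is inferred from the text rather than stated outright.

```python
# Sketch of the five aid levels as display sets. Display abbreviations follow
# the report; the level names, layout, and helper are illustrative assumptions.

INPUT_DISPLAYS = ("BFA", "ORAD", "ORGD", "WAT", "WAC")
OUTCOME_DISPLAYS = ("BFL", "ORAL", "ORGL")
UTILITY_DISPLAY = ("EU",)

AID_LEVELS = {
    "full aid": {
        "outputs": OUTCOME_DISPLAYS + UTILITY_DISPLAY,
        "uncertainty_bands": True,
    },
    "utility": {
        "outputs": UTILITY_DISPLAY,
        "uncertainty_bands": True,   # assumed; bands are only said to be removed below
    },
    "outcome": {
        "outputs": OUTCOME_DISPLAYS,
        "uncertainty_bands": True,   # assumed, as above
    },
    "no uncertainty bands": {
        "outputs": OUTCOME_DISPLAYS + UTILITY_DISPLAY,
        "uncertainty_bands": False,  # displays presented as means only
    },
    "unaided": {
        "outputs": (),               # input information only, tabular form
        "uncertainty_bands": None,   # band treatment not stated in the text
    },
}


def displays_for(level):
    """Return every display a subject sees at the given aid level."""
    return INPUT_DISPLAYS + AID_LEVELS[level]["outputs"]


for level in AID_LEVELS:
    print(f"{level:22s} {', '.join(displays_for(level))}")
```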

Hypothesis  1  was  tested  by  comparing  the  adequacy  of  subjects' 
strike  launch  time  choices  in  the  fully  aided  condition  with  the  adequacy 
of  their  choices  in  the  unaided  condition.  The  three  intermediate  aid 
levels  were  included  to  test  Hypothesis  5  and  to  isolate  the  contributions 
of  the  outcomes,  the  utilities,  and  the  uncertainty  bands  to  the  decisions 
made  in  the  full  aid  condition. 

The second factor, problem difficulty, was varied over two levels.
Each problem was classified as either "easy" or "hard." This served to
test partially Hypothesis 3. The method of classifying a problem as "easy"
or "hard" is discussed later in the section on "Problem Selection."

The final factor was included to allow at least partial test of
Hypothesis 3: effectiveness and perceived value of the ASTDA will not vary as
a function of the subjects' operational experience. To facilitate the testing
of this hypothesis, subjects were sampled from two populations: those with
Naval flight oriented experience and those who have not had such experience.
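
Taken together, the three factors define the cells of the five by two by two
design. The sketch below simply enumerates those cells; the level labels are
shorthand chosen for the example, and nothing is assumed here about how
subjects or problems were assigned to cells.

```python
from itertools import product

# Shorthand labels for the factor levels described in the text.
AID_LEVELS = ("full aid", "utility", "outcome", "no uncertainty bands", "unaided")
PROBLEM_DIFFICULTY = ("easy", "hard")
USER_EXPERIENCE = ("minimum", "considerable")

# Every combination of aid level, problem difficulty, and user experience.
design_cells = list(product(AID_LEVELS, PROBLEM_DIFFICULTY, USER_EXPERIENCE))

for aid, difficulty, experience in design_cells:
    print(f"{aid:22s} {difficulty:5s} {experience}")

print(len(design_cells), "cells")
```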

Criterion  Data 

Within a test such as that described here, the criterion (standard
against which the merit of the aid may be judged) choice represents a
particularly difficult problem. Any criterion must possess such attributes as
reliability, analyzability, objectivity, quantifiability, and acceptability.
The first four of these criterion requisites are psychometric in nature and
are technically manageable. The last requisite, acceptability, refers to the
degree that others will accept the criterion as an index of merit or,
alternatively, its relevance. Here value judgments come into play. At the
extreme, the merit of the aid during wartime use might be the only acceptable
criterion. Such a criterion is, of course, quite impractical. As one
successively backs off from this ultimate criterion, he becomes more and more
involved with intermediate criteria. Merit during a fleet exercise might
represent an intermediate criterion that is quite proximal to the ultimate.

When one is involved with a laboratory test, as in the present work,
the available criteria are more remote. Moreover, the conditions of a
laboratory test, no matter how realistically they may simulate actual
conditions, will only remotely resemble shipboard conditions and wartime
stresses. The reader may ask, "What confidence may we have in such
intermediate criteria?" The answer to this question seems to be that if the
aid is shown to possess merit relative to the intermediate criteria, it may
possess merit relative to more ultimate criteria. If the aid fails to possess
merit relative to intermediate criteria, it probably will not possess merit
relative to an ultimate criterion.

Two sets of standards or intermediate criteria were selected against
which the strike launch time decisions of the subjects were judged. The
first criterion was the launch time utility as predicted by the ASTDA. The
second criterion was the launch time preferences as judged by a panel of
experts. With these criterion data on hand, the agreement between the
expert opinion and the utilities predicted by the aid could be examined and
used as a measure of aid validity (Hypothesis 4).
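
One way to quantify agreement between the two criteria is a rank correlation
between the expert ranking and the ranking implied by the aid's utilities.
The sketch below uses Spearman's rho as an assumed, illustrative statistic;
the text does not state which agreement measure was applied, and the launch
times, ranks, and utilities shown are invented. The formula used is valid
only when neither ranking contains ties.

```python
# Hypothetical illustration: agreement between an expert launch time ranking
# and the ranking implied by aid-generated utilities, via Spearman's rho.
# The choice of statistic and all data values are assumptions.


def ranks_from_utilities(utilities):
    """Rank launch times by utility; the highest utility gets rank 1 (no ties)."""
    ordered = sorted(utilities, key=utilities.get, reverse=True)
    return {launch_time: position
            for position, launch_time in enumerate(ordered, start=1)}


def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(ranks_a)
    d_squared = sum((ranks_a[key] - ranks_b[key]) ** 2 for key in ranks_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Invented criterion data for one scenario problem (rank 1 = most preferred).
expert_ranks = {"0600": 2, "0900": 1, "1200": 3, "1500": 4}
aid_utilities = {"0600": 54.0, "0900": 80.0, "1200": 58.0, "1500": 20.0}

aid_ranks = ranks_from_utilities(aid_utilities)
rho = spearman_rho(expert_ranks, aid_ranks)
print(aid_ranks, round(rho, 3))
```

In this invented case the experts and the aid agree on the best and worst
launch times but swap the middle two, which gives a rho of about 0.8.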




Expert  Opinion  Criterion 

The expert based criterion data were obtained through the
cooperation of the ONR. Four Naval officers--two Captains, one Commander, and
one Lieutenant Commander--volunteered to form a panel. The panel may
be considered to be "experts" in that all members were senior in rank,
possessed considerable operational experience, and were familiar with the
nature of the strike timing problem and the ONRODA scenario. The panel met
over a 1.5 day period. At the outset, the panel was briefed on the purpose
of the aid evaluation, the ASTDA, the ONRODA scenario, assumed own and
enemy force strengths and fighting capabilities and characteristics, weather
conditions, and the problems inherent in selecting a launch time. Following
this, each member of the panel was asked to work independently through 24
scenario problems and to indicate, for each, a launch time ranking and the
difficulty of the problem.

After completing this independent work, the participants were asked
to assemble as a panel and each scenario problem was reviewed. In the
review, each participant indicated his launch time ranking, explained the
basis for his decisions, and stated how difficult it was to make the choice.
If one or more judges differed in the preferred launch time ranking for a given
scenario, the panel attempted to understand why the difference occurred.
After  this  discussion,  the  participants  were  given  the  option  to  change  their 
launch  time  selections.  This  was  done  in  an  attempt  to  obtain  convergence 
and  a  consensus  as  to  the  best  launch  time  ranking  for  each  problem.  In 
the  case  of  lack  of  full  convergence,  the  median  launch  time  ranking  of  the 
experts  for  each  scenario  problem  was  subsequently  used  as  the  preferred 
launch  time. 
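
The fallback rule described above, taking the median of the experts' rankings when the panel did not fully converge, can be sketched as follows. This is only an illustration: the four judges' rankings are hypothetical, the function name `consensus_ranking` is ours, and the tie-breaking rule (earlier launch time first) is an assumption that the report does not state.

```python
from statistics import median

def consensus_ranking(rankings):
    """Order launch times by the median rank assigned across judges.

    rankings: list of dicts, one per judge, mapping launch time -> rank (1 = best).
    Ties in median rank are broken by launch time (an assumption, not from the report).
    """
    times = sorted(rankings[0])
    median_rank = {t: median(judge[t] for judge in rankings) for t in times}
    return sorted(times, key=lambda t: (median_rank[t], t))

# Hypothetical rankings from four judges for six launch times.
JUDGES = [
    {"0600": 2, "0700": 1, "0800": 3, "0900": 4, "1000": 5, "1100": 6},
    {"0600": 1, "0700": 2, "0800": 3, "0900": 4, "1000": 6, "1100": 5},
    {"0600": 2, "0700": 1, "0800": 4, "0900": 3, "1000": 5, "1100": 6},
    {"0600": 2, "0700": 1, "0800": 3, "0900": 4, "1000": 5, "1100": 6},
]

consensus = consensus_ranking(JUDGES)
```

With an even number of judges the median can fall between two ranks, which is why the sketch sorts on the median value rather than expecting whole-number ranks.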


Aid  Generated  Utility  Criterion 

As a part of its internal logic, the ASTDA generates an expected
utility.  The  aid  generated  utility  values  were  employed  as  the  second  cri¬ 
terion  in  the  present  work. 


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Dependent  Variables 

As noted earlier, the primary dependent variable was the subjects'
choices of preferred air strike launch times. These were compared with
the criterion data.

Two other dependent measures were obtained: (1) a statement of
the perceived difficulty of each problem, and (2) a statement of each subject's
confidence in the correctness of his launch time ranking for each problem.

Exhibit  I  summarizes  the  evaluation  design.  Exhibit  II  summarizes 
the  information  made  available  in  each  of  the  five  aid  levels. 
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Exhibit  I 


Summary of Evaluation Design

                                                   Aid Levels
Background       Difficulty     Full Aid   Utility   Outcome   No Uncertainty   No Aid

Experienced      Easy, Hard     n = 6      n = 6     n = 6     n = 6            n = 6

Inexperienced    Easy, Hard     n = 6      n = 6     n = 6     n = 6            n = 6


Exhibit II

Information Made Available in Each Aid Level

                                 Information Provided
Level              Input     Utility     Outcome     Uncertainty Bands

Full Aid             ✓          ✓           ✓               ✓
Utility              ✓          ✓                           ✓
Outcome              ✓                      ✓               ✓
No Uncertainty       ✓          ✓           ✓
No Aid               ✓
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Subjects 

Sixty subjects participated in the study. Half (30) of the subjects
possessed Naval flight oriented experience and half (30) possessed no such
experience.* The experienced subjects were recruited from a variety of
sources, including Naval Air Reserve units, the Naval Air Development
Center, and advertisements in local newspapers. Their Navy ranks were:
Ensign = 1; Lieutenant = 12; Lieutenant Commander = 8; Commander = 6;
Captain = 1; Marine Corps Captain = 1; and "unavailable" = 1.** The sample
possessed a mean of 9.45 years in aviation, a mean of 2440 flight hours, a
mean of 88 carrier landings, a mean of 10.5 years in the Navy, and a mean
of 14 months of carrier duty.

The  inexperienced  subjects  were  exclusively  midshipmen  in  the  NROTC 
program  at  the  University  of  Pennsylvania. 

Experienced  subjects,  except  officers  on  active  duty,  were  paid 
$30.00,  and  inexperienced  subjects  were  paid  $10.00  for  participating  in 
the  study. 


Apparatus 

The evaluation was conducted in the decision aid facility established
by the ONR at the Department of Decision Sciences, Wharton School, University
of Pennsylvania. The timing, presentation, and storage of experimental
events were controlled by a PDP-10 computer. The subjects could enter commands
into the system through a Data Media terminal and the resultant displays
were projected on two screens. Tabular displays were presented on
one Data Media screen and the color graphics, controlled by a Grinnell LSI-11
microprocessor, were shown on a parallel screen. Figure 2 presents
the general equipment arrangement. As indicated in Figure 2, stations for
system support personnel and the evaluation administrators were separated
from the evaluation area. The evaluation conductors were able to observe
the information displayed for the subjects and their activities by way of a
special monitor.

Subject  Orientation  and  Training 

To provide a full, but standardized, orientation to each subject, a
set  of  video  tapes  was  prepared.  Each  subject,  depending  on  the  evaluation 


*Although useful to strike planning, flight experience is not a critical
prerequisite for strike planning.


**Where a subject was no longer on active duty, his rank on discharge is
reported.
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Schematic  of  area  and  equipment  used  in  ASTDA  evaluation. 


condition to which he was assigned, was shown two tapes. One of the
tapes provided a general discussion of the factors that the ASTDA considered
to predict the outcome of air strikes, e.g., engagement characteristics
and probabilities. All subjects viewed this tape. The use of
the other tapes was restricted; each was oriented for the subjects assigned
to a specific condition. In order to make the instructions received
by each subject as standardized as possible, special care was
taken with the construction of the tapes. The tape for the first condition,
the full-aid condition, was used as a master tape. Wherever possible,
the tapes for the other conditions were copies of the first except
that, depending on the condition, certain scenes were edited out. For
example, the scenes pertaining to the expected utility were deleted from
the tape used for the subjects in the outcome condition. Only rarely
did the editing prove to be untenable, in which case appropriate scenes
were added.

Problem  Selection 

Analytics,  Inc.  developed  the  bank  of  24  ASTDA  oriented  problems 
which  were  evaluated  by  the  experts.  Eight  of  these  problems  were  se¬ 
lected  for  inclusion  in  the  formal  evaluation. 

In order to satisfy the needs of the design, problem scaling was
required along a difficulty continuum. The expert panel indicated that
scenarios which contained large differences among the consequences
were easier. Therefore, one basis for estimating the difficulty differences
between problems was in terms of the spread of the possible outcomes
across the strike times: the greater the spread of consequences, the easier
the problem. The alternative, that close solutions might be easier because
a guess at a solution or tossing a coin to derive a solution would not cause
serious differential consequences, was not considered by the panel,
probably because serious questions are not answered in a trivial way.

The aid supplies a general measure of the consequences of launching
an air strike at any particular time as the expected utility. A statement
of the differences between the expected utilities across strike launch
times could accordingly be used as a gauge for specifying difficulty which
would appear to be congruent with the guidance of the expert panel. Therefore,
difficulty was relatively specified by the standard deviation of the expected
utilities across the strike times in each problem in the bank. Four
problems were selected which had utility standard deviations ranging from
2.81 to 4.06 ("hard" group) and four were assigned to the "easy" group
which possessed standard deviations ranging from 7.94 to 13.85.
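
The difficulty index described above can be sketched in a few lines. The expected utilities used here are those listed for the typical problem in Table 3. The use of the sample (n - 1) standard deviation is an assumption; it is chosen because it reproduces the z scores printed in Table 3. The function names are ours, and the grouping simply tests a spread against the two reported ranges rather than any cutoff stated in the report.

```python
from statistics import stdev

# Expected utilities for the six launch times of the Table 3 example problem.
TABLE3_UTILITIES = [43.89, 42.05, 25.16, 29.68, 28.84, 27.98]

# Ranges of utility standard deviations reported for the selected problems.
HARD_RANGE = (2.81, 4.06)
EASY_RANGE = (7.94, 13.85)

def utility_spread(utilities):
    """Sample standard deviation of expected utility across launch times.

    Using n - 1 in the denominator is an assumption, not stated in the report.
    """
    return stdev(utilities)

def difficulty_group(spread, ndigits=2):
    """Return "hard" or "easy" if the rounded spread falls in a reported range."""
    s = round(spread, ndigits)
    if HARD_RANGE[0] <= s <= HARD_RANGE[1]:
        return "hard"
    if EASY_RANGE[0] <= s <= EASY_RANGE[1]:
        return "easy"
    return None  # outside both reported ranges

spread = utility_spread(TABLE3_UTILITIES)
group = difficulty_group(spread)
```

A small spread means the launch times differ little in expected utility, which is the sense in which such a problem is "hard" under the panel's guidance.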

Exhibit III presents an example of one problem as it was presented
to  the  subjects. 


Exhibit  III 
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Example  of  Strike  Launch  Time  Problem 

Strike Timing Problem G

Your task is to select a strike launch time between 0600 and 1100 hours tomorrow for a cyclical strike against the
Orange Forces on Onroda Island.

Blue Force Readiness

        Number Definitely   Number In Repair Which   Average Number Expected To Be Ready At Each Time
Unit    Ready At            Might Be Ready By
Type    0600                1100                     0600    0700    0800    0900    1000    1100

BF1     10                  2                        10.9    10.9    10.9    10.9    10.9    10.9
BB1      5                  1                         5.6     5.6     5.6     5.6     5.6     5.6
BB2     13                  3                        14.9    14.9    14.9    14.9    14.9    14.9

Orange Air Defenses

Unit                                           Time of Encounter
Type    0600                0700                0800                0900                1000                1100

OF1     10.7 ( 7.8, 13.2)    9.8 ( 6.6, 13.0)    8.7 ( 5.4, 12.0)    8.0 ( 4.8, 11.2)    7.5 ( 4.5, 12.5)    7.3 ( 4.4, 10.2)
OF2     21.9 (17.3, 26.5)   19.7 (13.7, 25.7)   17.8 (11.7, 23.9)   16.4 (10.6, 32.2)   15.4 (10.0, 20.8)   14.6 ( 9.7, 19.5)

Orange Ground Forces

Unit                                           Time of Encounter
Type    0700                0800                0900                1000                1100                1200

OD1      8.4 ( 6.6, 10.2)    8.4 ( 6.6, 10.2)    8.4 ( 6.6, 10.2)    8.4 ( 6.6, 10.2)    8.4 ( 6.6, 10.2)    8.4 ( 6.6, 10.2)
OD2     10.7 ( 8.7, 12.7)   11.1 ( 9.1, 13.1)   11.5 ( 9.3, 13.7)   11.9 ( 9.7, 14.1)   12.2 ( 9.9, 15.5)   12.6 (10.3, 14.9)
OT1     20.7 (17.3, 24.1)   20.7 (17.3, 24.1)   20.7 (17.3, 24.1)   20.7 (17.3, 24.1)   20.7 (17.3, 24.1)   20.7 (17.3, 24.1)

Weather Conditions

Time For Weather Prediction:  0700  0800  0900  1000  1100  1200  1300

Location
Target      .90   .87   .84   .81   .78   .75
Carrier     .90   .87   .84   .81   .78   .75


Decisions, Judgments, and Ratings

A special computer routine was developed that was called by a
subject when he was ready to record his decision(s) for a problem. The
routine queried the subject about his choices by projecting questions on
one of the display screens. The subject was required to rank order six
potential strike launch times from "best" to "worst." The first question
asked the subject to indicate (from a choice of six) the time he thought
best to launch an air strike for the scenario involved in the problem. After
the question was displayed, the system waited for an answer to be entered.
Once the information was entered, the system asked for a second
best strike launch time to be entered. After the strike timing decisions
were collected, the routine then successively presented three more questions.
The first question inquired into the subject's confidence in his decisions
while the other two were concerned with the subject's perceived
difficulty in reaching a decision for the problem. The first asked the subject
to indicate his degree of confidence in his ranking along a five category
confidence scale which was projected on one of the display screens
along with the question. The scale ranged from "1" (not at all confident)
through "5" (completely confident). The subject answered by entering, by
means of the keyboard, the number appropriate to his level of confidence.

The first difficulty oriented question asked the subject to indicate
perceived problem difficulty on a five point difficulty scale. The scale
ranged from "1" (not at all difficult) through "5" (very difficult) and again
the subject responded by typing in his judgment. The second perceived
difficulty oriented question (third question) was structured by a magnitude
estimation technique and required the subject to rate the difficulty of the
problem he had just finished in relation to a modulus. The second practice
problem (see "Procedure" section) was used as the modulus and assigned
an arbitrary value of 100. It was expected that a scenario problem which
was perceived to be twice as difficult as the modulus would be assigned a
value of 200 while one perceived to be half as difficult would be assigned a
value of 50. The subject indicated his rating by entering through the keyboard
a representative number in response to the routine's query.

Procedure--Overview

Each subject was classified as either "inexperienced" or "experienced"
and randomly assigned to one of the five aid levels.

The sequence of presentation of the scenario problems was randomized
across subjects with the limitation that no more than two problems from
the same difficulty category could appear consecutively.
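
One way to generate such a constrained random sequence is rejection sampling: shuffle the eight problems and accept the first order in which no difficulty category appears more than twice in a row. The report does not say how the sequences were actually produced, so this is only a sketch under that assumption; the problem labels and function names are hypothetical.

```python
import random

# Hypothetical labels for the four "hard" and four "easy" test problems.
PROBLEMS = [("H1", "hard"), ("H2", "hard"), ("H3", "hard"), ("H4", "hard"),
            ("E1", "easy"), ("E2", "easy"), ("E3", "easy"), ("E4", "easy")]

def longest_run(labels):
    """Length of the longest run of identical consecutive labels."""
    best = run = 0
    prev = object()  # sentinel that equals no label
    for label in labels:
        run = run + 1 if label == prev else 1
        best = max(best, run)
        prev = label
    return best

def randomized_order(problems, max_run=2, rng=None):
    """Shuffle until no difficulty category occurs more than max_run times in a row."""
    rng = rng or random.Random()
    order = list(problems)
    while True:
        rng.shuffle(order)
        if longest_run([difficulty for _, difficulty in order]) <= max_run:
            return list(order)

order = randomized_order(PROBLEMS, rng=random.Random(0))
```

Rejection sampling keeps every admissible order equally likely, which a greedy "swap when a run gets too long" repair would not guarantee.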


Two data collection sessions, which lasted about three hours each,
were required for each subject. The time was divided between an instructional
and a testing phase. The instructional phase required about one hour.
During this time, two video tapes were shown and the subject used the aid
to solve two practice problems. The testing phase followed. Following
completion of the problems, an interview was conducted.


Subject  Training 


Prior  to  actual  practice  using  the  aid,  each  subject  viewed  two 
video  tapes.  The  first  tape  was  general  and  appropriate  to  all  conditions. 
The  second  video  tape  shown  to  each  subject  was  the  one  which  was  spe¬ 
cifically  fitted  to  the  aid  condition  to  which  the  subject  was  assigned. 

The first tape initially discussed the purposes of the testing, what
was expected of the subjects, the ONRODA scenario, and the factors which
the ASTDA considers in evaluating the outcome of an air strike. The factors
included survival probabilities, engagement characteristics, and tactical
considerations. The survival probabilities discussed were: (1) Blue
aircraft survival probabilities against Orange air and ground forces in
one-on-one engagements, (2) the conditional probabilities of Blue bombers
evading Orange attack fighters during ingress and successfully returning
to the carrier, (3) Orange air and ground force survival probabilities
against Blue aircraft in one-on-one engagements, (4) survival probabilities
for various numbers of Blue fighter/interceptors against various numbers
of Orange attack fighters, and (5) survival probabilities for various numbers
of Orange attack fighters against various numbers of Blue
fighter/interceptors.

Other  factors  discussed  in  this  tape  concerned  limitations  on  the 
number  of  simultaneous  attacks  of  each  unit  and  the  tactical  assignment 
of  Blue  units  to  various  Orange  ground  targets.  The  survival  probabil¬ 
ities,  engagement  characteristics,  and  tactical  assignment  were  discussed 
for  both  "good"  and  "bad"  weather  contingencies. 

One other factor which was described was the relative value to Blue
of each lost or destroyed unit. These values are shown in Table 2 and were
used by the aid to calculate EU. They were discussed to give the subject
some  concept  of  the  relative  value  weighting  of  various  units.  In  any  op¬ 
erational  version  of  such  an  aid,  the  user  would  be  able  to  insert  his  own 
values  for  the  various  units,  depending  on  the  tactical  situation,  his  own 
experience,  and  other  factors. 

The second tape made available information that was appropriate
to each condition. The input information, BFA (Blue Force Availability),
ORAD (Orange Air Defense), ORGF (Orange Ground Forces), WAT (Weather
at Target), and WAC (Weather at Carrier), was described and demonstrated
(this information was confined to the tabular form for the no-aid
condition). The output information was specific to conditions. For the
full-aid condition, all the output information was described and demonstrated:
BFL (Blue Force Losses), ORAL (Orange Air Losses), ORGL
(Orange Ground Losses), and EU (Expected Utility). These were described
and illustrated both as means and delta biased uncertainty bands
in both the tabular and the graphic forms. In the expected utility condition,
only the information pertinent to the EU and the related displays
was presented. Likewise, in the outcome condition, the BFL, the ORGL,
and the ORAL displays were described and illustrated. Of course, in the
no aid condition the output information was not even mentioned. For the
no uncertainty bands condition, virtually the same information was discussed
as for the full-aid condition except that no references were made
to the measures of uncertainty normally given by the aid. Normally, in
a tabular display, the ASTDA presents the mean and the delta biased uncertainty
for each strike launch time, e.g., in the BFL display, among
other things, data for the lost BF1s at a particular launch time might be
presented as 5 (2, 7). The first number of the expression represents the
mean and the numbers in the parentheses are the lower and upper bounds
of the delta biased uncertainty measure. In the no uncertainty condition,
the same data were displayed as 5 (0, 0). To explain this ambiguity the
relevant tape made a vague reference to a lack of variability. In the normal
graphic presentations, the means were represented as points and the
uncertainty bands were represented as bars extending above and below
the points. Only the means (points) were shown and discussed in the no
uncertainty condition.


Table  2 

Equivalent Unit Value to Blue of the Destruction
of a Single Force Unit of Each Type


Unit Type      Value


OF1
OF2
OD1
OD2
OT1
Practice Session and Menus

After the two video tapes were viewed and questions about procedure,
if any, answered, the subjects worked through two practice scenarios
with the help of the evaluation administrator. The only difference between
the practice scenario administration and the test scenarios--aside
from the participation of the administrator--was that in the practice sessions
the subjects were not required to make judgments of confidence and
difficulty. The practice sessions served to familiarize the subjects further
with the aid, the use of the equipment system, the various displays,
and their task during the formal data acquisition.

The evaluation administrator acted as the system operator for the
first practice problem. He demonstrated how to call displays by entering
the appropriate commands. The commands were single words of three to six
letters and were organized into two lists or menus--an input and an
output menu. One list was available at a time and displayed at the bottom
of one display screen. At the beginning of each problem, the input menu
was available. It contained each command used to call up the input displays,
i.e., BFA, ORAD, ORGF, WAT, and WAC. In addition, the list
contained four other commands. The first, a HELP command, produced
a list of all the available displays and the commands used to call them.
The second, a DECIDE command, called the special routine which allowed
the subject to record his strike timing decision(s), and to record his estimates
of difficulty and of confidence. The third, a RUN command (not
available in the no-aid condition), removed the input list and produced the
output list. Finally, a RETURN command removed the output menu and
restored the input menu.

The output menu's contents varied across conditions. The output
menu contained commands for all the output data in the full aid and the no
uncertainty conditions, i.e., BFL, ORAL, ORGL, and EU. In the outcome
condition, the menu did not contain the EU command while in the expected
utility condition the menu did not contain the BFL, ORAL, and ORGL commands.
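
The dependence of the menus on the aid condition can be summarized as a small lookup table. The sketch below models only what the text states: which display commands appear on the output menu for each condition, and the fact that RUN is unavailable in the no-aid condition. The condition labels and function names are ours, and nothing here is taken from the original PDP-10 software.

```python
# Input display commands available in every condition.
INPUT_DISPLAYS = ["BFA", "ORAD", "ORGF", "WAT", "WAC"]

# Output display commands by aid condition (condition labels are ours).
OUTPUT_DISPLAYS = {
    "full aid": ["BFL", "ORAL", "ORGL", "EU"],
    "no uncertainty": ["BFL", "ORAL", "ORGL", "EU"],
    "outcome": ["BFL", "ORAL", "ORGL"],
    "utility": ["EU"],
    "no aid": [],  # no output menu can be reached without RUN
}

def run_available(condition):
    """RUN switches from the input menu to the output menu; not offered with no aid."""
    return condition != "no aid"

def output_commands(condition):
    """Display commands offered on the output menu for the given condition."""
    return list(OUTPUT_DISPLAYS[condition])
```

For example, `output_commands("outcome")` omits EU, while `output_commands("utility")` offers EU alone.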


After  the  first  practice  scenario  was  finished,  the  fact  that  the  sub¬ 
ject  would  normally  be  questioned  at  the  end  of  each  test  problem  on  his 
confidence  and  the  perceived  difficulty  of  the  problem  was  explained.  A 
clear  and  concise  description  of  the  magnitude  estimation  technique  was 
given  and  the  use  of  the  second  problem  as  a  modulus  was  discussed.  Any 
questions  were  again  answered  in  context.  Then,  the  subject  worked  his 
way  through  this  second  problem  on  his  own.  The  subject  acted  as  the  sys¬ 
tem  operator  and  entered  the  commands.  After  the  subject  entered  his 
strike  timing  decisions,  he  was  asked  whether  he  had  any  additional  pro¬ 
cedural  questions.  Then,  the  formal  data  collection  started. 
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The  subject  then  proceeded  with  the  eight  test  problems  in  the  pre¬ 
determined  random  order.  The  sequence  was  initiated  by  the  administra¬ 
tor  who  called  from  the  system  one  of  five  experimental  versions  of  ASTDA. 
The  version  called  was  appropriate  to  the  evaluative  condition.  After  the 
subject  started,  he  was  left  on  his  own.  The  administrator  stationed  him¬ 
self  at  the  back  of  the  partitioning  screen  where  he  could  unobtrusively  ob¬ 
serve  the  commands  entered  and  displays  called.  The  specialized  testing 
computer  routine  which  was  used  to  present  the  problems  also  created  a 
data  file  for  each  problem.  The  file  stored  timing  factors,  all  the  com¬ 
mands  entered  by  the  subject,  the  subject's  strike  timing  rank  order  deci¬ 
sions,  and  his  confidence  and  difficulty  judgments.  The  routine  measured 
and  stored  the  time  between  a  command  and  when  the  tables  and  graphs  were 
fully  displayed  and  then  measured  the  time  to  the  next  command.  In  addition, 
the  sequential  relationship  of  the  commands  was  preserved.  The  routine  al¬ 
so  asked  the  subject,  at  the  end  of  each  problem,  if  he  wished  to  continue 
with  the  next  problem.  A  "yes"  entry  called  the  next  problem.  When  all 
eight  problems  were  completed,  the  routine  informed  the  subject  that  the 
formal  data  collection  was  finished.  The  net  result  was  that  the  subject 
could  work  at  his  own  pace  without  administrator  intrusion  from  data  col¬ 
lection  start  to  completion. 

After  Data  Collection  Interview 


After each subject completed the eight problems, an interview was
administered. The interview, generally, attempted to obtain a qualitative
evaluation of all aspects of the aid including usefulness and workability.
The interview was semistructured in nature and inquired into three specific
topic areas. The first attempted to obtain the data required for an assessment
of the aid's utility. This assessment was implemented by a
multi-attribute utility analysis. The second part of the interview consisted of an
evaluation of the usefulness of the aid and its components. The final part
was directed toward an evaluation of the organization and content of the
display and control systems.
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The  analysis  of  the  emergent  data  proceeded  in  an  orderly  manner. 
First,  a  criterion  (validity)  analysis  was  completed.  This  analysis  sought 
to determine the degree of agreement between the two available criteria--
expert  panel  judgments  and  expected  utility  as  computed  by  the  ASTDA. 
Then,  the  differences  between  the  various  conditions  were  examined.  Next 
the  data  were  examined  for  learning  effects.  In  addition,  a  multiple  corre¬ 
lation  and  regression  analysis  was  performed  on  the  dependent  variables 
relative  to  the  information  made  available.  Finally,  the  interview  results 
were  examined. 


Criterion  Analysis 

The  criterion  analysis  sought  to  establish  the  relationship,  if  any, 
between  the  two  sets  of  criterion  data.  One  set,  the  EU  criteria  data, 
represented the best predictions of the ASTDA while the second represented
the pooled judgment of experienced naval operations personnel
about the preferred launch time ranking.

Specifically,  the  expert  panel  provided,  for  each  problem,  the  best 
two  and  the  worst  two  strike  launch  times  from  a  choice  of  six  potential 
launch times. For each problem, the panelists' joint strike launch times
were  ranked  and  paired  with  the  ASTDA  calculated  expected  utility.  The 
assignment  of  utility  to  the  experts'  decisions  is  illustrated  in  Table  3  for 
a  typical  problem.  The  top  left  part  of  Table  3  contains  the  six  possible 
strike  launch  times,  1200  to  1700  hours  inclusive,  while  next  to  each  is 
the  mean  and  the  range  of  the  expected  utility  calculated  by  the  ASTDA  for 
each  time. 

The bottom half of the table contains two sets of ranked times and
their utilities--on the left those of the ASTDA and on the right the utilities
for the panel judgments. The highest expected utility in the example is
43.89, which is associated with a 1200 hours launch. The lowest utility
occurs for a 1400 hours launch and was assigned the sixth rank by the ASTDA.

Under the heading "Panel" in Table 3, the ranked judgments of the
panel appear along with the associated utility. The result was 32 paired
values (4 launch times per problem x 8 problems). The raw score
intercorrelation between the two arrays was .91 (Figure 3 (A)). Because of
differences in the utility distributions within each problem, the data for
each problem were converted to normal deviates (z scores) and the
correlation coefficient was again calculated. The resultant correlation
coefficient was .47 (Figure 3 (B)). These results indicate at least a moderate,
positive relationship between the two measures and support contentions
favoring the validity of the ASTDA.
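
The two steps described above, converting utilities to z scores within a problem and then correlating the aid's and the panel's paired values, can be illustrated with the single problem shown in Table 3. This is a sketch only. The report pooled 32 pairs across eight problems, whereas the example below uses the four pairs of one problem, so its coefficient is not comparable with the reported values. The sample (n - 1) standard deviation is an assumption that reproduces the z scores printed in Table 3, and the function names are ours.

```python
from math import sqrt
from statistics import mean, stdev

# Expected utility by launch time for the Table 3 example problem.
UTILITY = {"1200": 43.89, "1300": 42.05, "1400": 25.16,
           "1500": 29.68, "1600": 28.84, "1700": 27.98}

def z_scores(utility_by_time):
    """Standardize utilities within one problem (sample standard deviation assumed)."""
    values = list(utility_by_time.values())
    m, s = mean(values), stdev(values)
    return {t: (u - m) / s for t, u in utility_by_time.items()}

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Launch times holding ranks 1, 2, 5, and 6 for the aid and for the panel (Table 3).
AID_TIMES = ["1200", "1300", "1700", "1400"]
PANEL_TIMES = ["1200", "1300", "1600", "1700"]

z = z_scores(UTILITY)
aid_z = [z[t] for t in AID_TIMES]
panel_z = [z[t] for t in PANEL_TIMES]
r_example = pearson(aid_z, panel_z)
```

Standardizing within each problem removes between-problem differences in the level and spread of utility, which is consistent with the pooled coefficient dropping once the data are expressed as z scores.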
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Table  3 


Method of Comparing Utility of ASTDA and
Expert Judgment Launch Times for a Typical Problem


                          Utility (ASTDA)
Launch Time     Mean      Range                  z

1200            43.89     (  9.00, 69.75)        1.3794
1300            42.05     (  6.85, 68.00)        1.1478
1400            25.16     (-25.58, 61.00)       -0.9786
1500            29.68     (-26.21, 64.00)       -0.4096
1600            28.84     (-29.67, 65.00)       -0.5153
1700            27.98     (-32.67, 65.00)       -0.6236


                             Aid                            Panel
                Rank   Time   Utility    z           Time   Utility    z

Best Times      1      1200   43.89      1.3794      1200   43.89      1.3794
                2      1300   42.05      1.1478      1300   42.05      1.1478

Worst Times     5      1700   27.98     -0.6236      1600   28.84     -0.5153
                6      1400   25.16     -0.9786      1700   27.98     -0.6236
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The  panel,  for  two  problems,  chose  to  launch  early,  preemptive 
strikes on the basis of military judgment. This judgment encompassed
factors not considered by the aid. The decisions to launch preemptive
strikes were made regardless of other advantages or disadvantages, e.g.,
the  number  of  own  (Blue)  aircraft  available,  or  the  number  and  kind  of 
Orange  air  defenses.  The  critical  conditions  seem  to  have  been  a  large 
collection  of  parked  Orange  aircraft  and  intelligence  information  that  an 
attack  was  imminent.  The  panel  thought  a  preemptive,  early  strike  was 
worth the additional loss of own men and equipment. However, the ASTDA
does not include such considerations within its logic and therefore cannot
differentially predict the utility of preemptive strikes. This limitation has
implications for aid design and is discussed later. Moreover, for the
problems  involved,  the  preemptive  strike  times  were  fortuitously  undesir¬ 
able  as  compared  with  the  other  times  under  consideration.  For  example, 
for  problem  4,  the  best  time  chosen  by  the  panel  was  evaluated  by  the  aid 
to  have  a  slightly  negative  value.  In  the  other  problem  in  which  preemp¬ 
tive  strikes  were  decided  on  by  the  panel,  problem  6,  the  panel's  rankings 
were  virtually  the  opposite  of  those  yielded  by  the  aid. 

Accordingly,  the  data  sets  for  these  two  problems  were  eliminated 
and  the  correlation  coefficient  was  again  computed  employing  the  normal¬ 
ized  data.  In  Figure  3  (B),  the  circled  points  represent  the  eliminated 
data.  The  resultant  correlation  coefficient  was  .67. 

One  may  also  argue  that  in  actuality  only  one  launch  time  is  possible 
for  a  given  strike.  By  this  argument,  only  the  first  choice  becomes  rele¬ 
vant.  Accordingly,  only  first  choices  were  intercorrelated.  The  resulting 
correlation coefficient was .99 (N = 8).

If the panel derived data are assumed to represent the criterion to
be predicted by the ASTDA, then these correlation coefficients represent
validity estimates for the aid itself, i.e., the relationship between the
experts' judgments and the aid's prescriptions concerning the problems. The
relationship appears positive and moderate in magnitude. This relationship
increased dramatically when only problems which were solved by the
panel without preemptive strikes were considered, and a further increase
was demonstrated when only first choices were considered.

Aid  Conditions 


Ranked  Difference  Scores 

To analyze the differences among responses as a function of the
various aid conditions (see Exhibits I and II), two indexes were used. The
first index reflected the difference between the rankings produced by the
aid (utility rankings) and those assigned by the subjects for the launch times
associated with each problem. The second index represented a measure
of the difference between the panel's rankings of the launch times for each
problem and the rankings of the individual subjects.

Exhibit IV illustrates the calculation of ranked difference scores.
In calculating these scores, only the ranks 1, 2, 5, and 6 were used because
of limits on the data obtained from the expert panel. In the Exhibit
IV example, the first ranked time by the aid was 1200 hours while the
subject ranked 1200 hours fifth, a difference of four. The four represents
the first value of the ranked difference score. The aid ranked 1300 hours
as second best while the subject assigned 1300 hours to the first rank;
therefore, there was a difference value of one. This became the second
value to enter the ranked difference score. For the fifth ranked time,
there was a one rank difference between the aid and the subject. For the
sixth ranked time, there was a two rank difference. The numbers one and
two were entered into the ranked difference score for the third and fourth
values (fifth and sixth ranks) respectively. Summing across the four difference
values gives the ranked difference sum (8) against the aid (expected
utility) criterion.

A  parallel  technique  was  employed  to  calculate  ranked  difference  sums 
using  the  panel  judgments  as  the  criterion.  This  is  illustrated  in  the  bottom 
half  of  Exhibit  IV  where  the  sum  of  the  difference  scores  for  the  subject  on 
the  individual  problem  is  six. 

The ranked difference scores are an inverse measure of agreement,
with lower ranked difference scores indicating better agreement. The scores,
as  calculated,  possess  a  range  of  zero  to  16.  A  zero  rank  difference  indicates 
perfect  agreement  and  a  rank  difference  score  of  16  indicates  a  ranking  of 
the  launch  times  by  the  subject  inversely  to  that  of  the  criterion. 
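
The Exhibit IV calculation can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used in the study; the function name `ranked_difference_score` and the data layout (a best-first list for the subject, a rank-to-time mapping for the criterion) are assumptions made for the example.

```python
def ranked_difference_score(subject_ranking, criterion_ranks):
    """Sum of absolute rank differences over the scored criterion ranks.

    subject_ranking: launch times in the subject's order, best first.
    criterion_ranks: mapping of criterion rank -> launch time, limited to
                     the ranks that were scored (1, 2, 5, and 6).
    """
    # Rank that the subject gave to each launch time (1 = best).
    subject_rank = {time: rank
                    for rank, time in enumerate(subject_ranking, start=1)}
    return sum(abs(rank - subject_rank[time])
               for rank, time in criterion_ranks.items())


# Data from Exhibit IV.
subject = [1300, 1600, 1500, 1400, 1200, 1700]
aid = {1: 1200, 2: 1300, 5: 1700, 6: 1400}
experts = {1: 1300, 2: 1200, 5: 1600, 6: 1700}

score_vs_aid = ranked_difference_score(subject, aid)        # 4 + 1 + 1 + 2
score_vs_panel = ranked_difference_score(subject, experts)  # 0 + 3 + 3 + 0

# A subject who exactly reverses the criterion order reaches the maximum.
reversed_subject = [1400, 1700, 1500, 1600, 1300, 1200]
worst_case = ranked_difference_score(reversed_subject, aid)  # 5 + 3 + 3 + 5
```

The two Exhibit IV sums come out as 8 and 6, and the fully reversed ordering gives 16, the upper end of the range.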

Variance  Analysis  of  Difference  Scores  Using  the  Aid  as  the  Criterion 

The  ranked  difference  scores  based  on  the  ASTDA's  utility  values 
as  the  criterion  were  subjected  to  a  two  (problem  difficulty)  by  five  (aid 
levels)  by  two  (subject  experience)  analysis  of  variance.  The  results  of 
this  analysis  are  presented  in  Table  4. 

The  results  of  the  variance  analysis  indicated  statistically  significant 
variance  due  to  the  aid  level  and  the  problem  difficulty  main  effects.  The 
aid  level  by  problem  difficulty  interaction  was  also  statistically  significant. 
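
The entries in a variance summary are linked by two identities: a mean square is a sum of squares divided by its degrees of freedom, and an F ratio is an effect mean square divided by the corresponding error mean square. The sketch below checks these identities against the between-subjects entries of Table 4 (E x A: sum of squares 49.49, df 4; error: sum of squares 950.40, df 50; aid levels mean square 299.53). The helper names are illustrative only.

```python
def mean_square(sum_of_squares, df):
    """Mean square = sum of squares / degrees of freedom."""
    return sum_of_squares / df


def f_ratio(ms_effect, ms_error):
    """F = effect mean square / error mean square."""
    return ms_effect / ms_error


# Between-subjects entries taken from Table 4.
ms_e_by_a = mean_square(49.49, 4)          # E x A mean square
ms_error_between = mean_square(950.40, 50)  # subjects-within-groups error
f_aid_levels = f_ratio(299.53, ms_error_between)

print(round(ms_e_by_a, 2), round(ms_error_between, 2), round(f_aid_levels, 2))
```

Rounded to two places, these reproduce the tabled mean squares of 12.37 and 19.01 and the aid level F of 15.76.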

The  mean  values  for  the  various  conditions  are  summarized  below: 


Aid Level                        Mean

Full Aid (A1)                    2.20
Utility (A2)                     2.95
Outcome (A3)                     2.96
No Uncertainty (A4)              2.74
No Aid (A5)                      6.60

Difficulty

Hard                             3.83
Easy                             3.07

Experience

Operational Experience           3.46
No Operational Experience        3.53


Exhibit IV

Example of Calculation of Ranked Difference Score

                               Rankings                    Ranked Difference
                1      2      3      4      5      6            Score

Subject       1300   1600   1500   1400   1200   1700
Aid (EU)      1200   1300                 1700   1400
Difference       4      1                    1      2           Σ = 8
Experts       1300   1200                 1600   1700
Difference       0      3                    3      0           Σ = 6


Table 4

Variance Summary for the Ranked Difference Scores--EU Criterion

                                         Sum of                Mean
Source                                  Squares      df      Square          F

Between                                 2437.50
Experience (E)                             0.00       1  [illegible]       0.03
Aid Levels (A)                          1198.12       4      299.53       15.76*
E x A                                     49.49       4       12.37  [illegible]
Error: Subjects within groups            950.40      50       19.01

Difficulty (D)                            85.85       1       85.85        6.62*
D x E                                      5.85       1        5.85        0.45
D x A                                    133.49       4       33.37        2.57*
D x E x A                                 13.70       4        3.43        0.26
Error: D x Subjects within groups        648.23      50       12.96

*p = 0.05


Table 5

Variance Summary for the Ranked Difference Scores--Panel Criterion

                                         Sum of                Mean
Source                                  Squares      df      Square          F

Between                                 1907.15
Experience (E)                            21.07       1       21.07        1.11
Aid Levels (A)                           159.83       4       39.90        2.05**
E x A                                     14.28       4        3.57        0.18
Error: Subjects within groups            974.37      50       19.49

Difficulty (D)                             0.07       1        0.07        0.01
D x E                                      0.13       1        0.13        0.01
D x A                                    192.88       4       48.22        4.07*
D x E x A                                 27.78       4        3.05        0.07
Error: D x Subjects within groups        510.13      50       10.32

*p = 0.05
** approaches p = 0.05


The lowest mean score, 2.20, was observed in the full aid condition
while slightly higher (poorer) ranked difference scores (2.95, 2.96, and 2.74)
resulted from the utility, the outcome, and the no uncertainty conditions,
respectively. The highest, most deviant difference score, 6.60, was observed
in the no aid control condition. A Newman-Keuls a posteriori comparison
of the means using the Studentized range statistic indicated no statistically
significant differences among the first four conditions but that the
no aid condition differed significantly from the others.

In  addition  to  aid  levels,  the  difficulty  of  the  scenario  problems  also 
produced  systematic  variance  differences  in  the  ranked  difference  scores. 
Difference  scores  tended  to  be  higher  (poorer)  when  working  with  the  hard 
scenario  problems.  The  effect  of  difficulty  also  significantly  interacted  with 
aid  levels.  In  Figure  4,  within  aid  levels,  the  mean  ranked  difference  scores 
for  the  hard  scenario  problems  are  consistently  higher  than  those  for  the  easy 
problems  across  the  first  four  aid  levels.  In  the  fifth  aid  level,  the  no  aid 
level,  this  relationship  is  reversed.  A  comparison  of  the  means  through  the 
Newman-Keuls  test  indicated  no  statistically  significant  differences  across 
aid levels for the hard problems. However, for the easy problems, the mean
ranked difference scores from condition A5, the no aid condition, differed
significantly from the means of all the other conditions except the utility
condition.


A  comparison  by  aid  levels  across  difficulty  levels  did  not  indicate 
any  systematic  differences. 

It appears that, in terms of the ranked difference scores here involved,
there was little consistent tendency for the subjects to rank order the launch
times differently in any condition in which all, or parts, of the output displays
(BFL, ORAL, ORGL, and EU) were available. That is, not having the outcome
displays, or the utility display, or the uncertainty bands did not significantly
affect the variance across ranks. Only when the subjects were not given any
output information in condition A5 was there any appreciable effect on their
choices. It is possible that the information contained in the various displays
is positively correlated. Accordingly, increased information volume may
have contributed little.


Figure 4. The mean ranked difference scores from the EU criterion
for each aid and difficulty level. Aid levels (A1, A2, A3,
A4, and A5) correspond to the full aid, utility, outcomes,
no uncertainty, and no aid conditions, respectively.

The distribution of times was also affected by problem difficulty;
hard problems led to a wider divergence from the criterion while easy
problems produced rankings closer to those of the criterion. However,
the difficulty by aid level interaction data suggest that most of the differences
associated with difficulty were produced by the no aid condition.

Variance  Analysis  of  Difference  Scores  Using  the  Panel's  Judgments  as 
the  Criterion 


The ranked difference scores were also analyzed with the experts'
judgments forming the criterion. The only statistically significant variance
was attributable to the aid level by difficulty interaction. Generally, the
difference scores produced by this analysis tended to be higher than those
calculated on the basis of the aid produced expected utility criterion. The
mean ranked difference score across all conditions was 3.45 for the expected
utility but it was 6.08 when calculated against the experts' judgments.
Part of this difference might be due to the inclusion in the analyses of those
problems for which the experts chose to launch preemptive strikes. In
addition, both of these preemptive problems were classified as easy. This
would tend to inflate the ranked difference scores for the easy problems.

A  summary  of  a  variance  analysis  of  these  data  is  presented  as  Table  5. 

Within  the  analysis  of  variance,  there  were  few  systematic  differences  across 
or  within  conditions.  There  was  a  tendency  toward  significant  effects  across 
the  aid  levels.  The  variance  due  to  experience  or  problem  difficulty  was  not 
statistically  significant. 

The mean ranked difference scores for each condition were:

Aid Level                        Mean

Full Aid (A1)                    5.40
Utility (A2)                     5.40
Outcome (A3)                     6.31
No Uncertainty (A4)              6.72
No Aid (A5)                      6.58

Difficulty

Hard                             6.09
Easy                             6.07

Experience

Operational Experience           6.47
No Operational Experience        6.29

A  comparison  of  the  aided  conditions  with  the  no  aid  condition  seems 
to  suggest  that  in  the  full  aid  condition  the  subjects  tended  to  have  lower 
(better) scores than in the no aid condition. The utility condition (A2) yielded
scores more like the full aid (A1) and the no uncertainty condition (A4) produced
scores more like the no aid condition.

The subjects tended to produce better (lower) rank difference scores
in the full aid and the utility conditions than for the other aid levels involved.

The  statistically  significant  interaction  effect  is  plotted  in  Figure  5. 

Regret 

The prior analyses depended on the sensitivity of the ranked difference
scores to the various evaluation conditions. It is possible that scoring
on the basis of launch time rankings may have obscured real differences between
strike times and not accurately weighed the consequences of differences
among the rankings. Consider the case in which the best strike time has a
utility of 50 and the second best time has a utility of 49. Ranked that way by
the subject and the criterion, the result is a zero ranked difference score.
If the subject reversed the times so that he assigned a rank of one to the
time with a utility of 49 and a two to the time with a utility of 50, then a one
would be generated as the ranked difference score. But is a difference of
one in utility units equivalent to a rank difference of one? If the second best
time according to the aid had a utility of 40, would such a difference represent
an important difference? If a subject ranked as best the time with the 40
utility and as second best the time with the 50, should this inversion be given
the same weight as the difference between 49 and 50?

It seemed possible that a "regret analysis" would allow for a more
sensitive evaluation of the differences between utilities. The regret score
is defined as the difference between the utility associated with the time
specified by the aid and the utility associated with the time chosen by the
subjects. If the best time predicted by the aid had a utility of 50 and the
best time chosen by a subject had a utility of 45, then the difference between
these utilities, 5, represents the regret score. Regret scores were calculated
in order to assess differences in utility value between times indicated
by the aid and those chosen by the subjects. The regret scores were only
calculated for the best launch time because it was thought that second or
third best times represented academic issues of minimum consequence to
the operational situation.
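
The regret definition can be expressed as a short sketch. The utilities below are hypothetical values chosen to match the 50/49/45/40 illustrations in the text, and `regret_score` is an illustrative name, not part of the ASTDA.

```python
def regret_score(utilities, subject_first_choice):
    """Regret for a subject's first-choice launch time.

    utilities: mapping of launch time -> expected utility computed by the aid.
    Returns the utility of the aid's best time minus the utility of the
    time the subject ranked first.
    """
    best_utility = max(utilities.values())
    return best_utility - utilities[subject_first_choice]


# Hypothetical expected utilities for four candidate launch times.
utilities = {1200: 50, 1300: 49, 1400: 45, 1500: 40}

regret_same_choice = regret_score(utilities, 1200)  # subject agrees with aid
regret_near_miss = regret_score(utilities, 1300)    # 50 - 49
regret_text_case = regret_score(utilities, 1400)    # 50 - 45
regret_large_miss = regret_score(utilities, 1500)   # 50 - 40
```

Unlike a rank inversion, which counts the 49-for-50 and 40-for-50 substitutions alike, the regret score separates them (1 versus 10 utility units).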

Variance  Analysis  of  Regret  Scores 

The regret scores for the expected utility criterion were subjected
to a two by five by two variance analysis. The differences between predictions
made by the aid and the subjects' chosen times were first examined.
The analysis indicated significant systematic variance across aid levels,
difficulty levels, and their interaction. These differences parallel those
which resulted from the analysis of the ranked difference scores. In addition,
there was also a tendency toward significant differences across
experience levels (operationally experienced/operationally inexperienced).

The summary of this variance analysis is presented as Table 6.

Figure 5. Mean ranked difference scores from the panel criterion for
each aid and difficulty level. Aid levels (A1, A2, A3, A4,
and A5) correspond to the full aid, utility, outcome, no
uncertainty, and no aid conditions, respectively. (Legend: o, hard
problems; •, easy problems.)


Table 6

Variance Summary for the Regret Scores--EU Criterion

                                         Sum of                Mean
Source                                  Squares      df      Square          F

Between                                 4718.67
Experience (E)                            40.84       1       40.84        2.67*
Aid Levels (A)                          1620.11       4      405.03       26.48**
E x A                                     70.69       4       17.67        1.16
Error: Subjects within groups            764.92      50       15.30

Difficulty (D)                           360.45       1      360.45       20.98**
D x E                                      5.00       1        5.00        0.29
D x A                                    962.46       4      240.61       14.01**
D x E x A                                 35.35       4        8.84        0.51
Error: D x Subjects within groups        858.67      50       17.17

* tendency toward significance, p = 0.109
**p = 0.05


The mean regret scores for the various conditions were:

Aid Level                        Mean

Full Aid (A1)                    0.51
Utility (A2)                     1.58
Outcome (A3)                     1.09
No Uncertainty (A4)              0.97
No Aid (A5)                      5.57

Difficulty

Hard                             1.07
Easy                             2.81

Experience

Operational Experience           1.64
No Operational Experience        2.23

The scores from the first four aid levels tended to be rather similar
with the lowest (best) score, 0.51, resulting from the full aid condition and
the highest (worst) score, 1.58, resulting from the utility condition. The
mean regret score for the no aid condition (5.57) was considerably higher
(worse) than that for the remaining levels. A Newman-Keuls test indicated
that the mean regret score for the no aid condition varied significantly from
the rest while no statistically significant differences occurred among the
means of the other conditions.

The operationally experienced group's mean regret score was 1.64.
This value was substantially lower (better) than the inexperienced group's
mean regret score of 2.23. The mean difference represents a tendency
towards statistical significance.

The difficulty of the scenario problems also affected the regret scores.
When the problems were hard, the regret scores were fairly low with a mean
of 1.07 and when they were easy the scores were significantly higher with a
mean of 2.81.

In addition to the significant main effects, the first order interaction
of aid level and difficulty was statistically significant. The interaction
effect is shown in Figure 6, which also indicates that some form of aiding
decreased regret scores considerably--especially for easy problems.

The easy and the hard problems produced significant differential effects
within the aid levels. An examination of the means indicated that in
the full aid, the outcome, and the no uncertainty conditions, the regret
scores varied little across difficulty levels. However, in the other two conditions,
utility and no aid, there were noticeable increases in the regret
scores of the easy problems as compared with the hard problems. A
Newman-Keuls analysis indicated no systematic mean differences among
hard problems across aid levels. On the other hand, for the easy problems
the mean regret score from the no aid condition (A5) was significantly
higher than the others. A comparison across difficulty within each aid level
indicated that the only statistically significant difference was in the no aid
condition.

Figure 6. Mean regret score for the EU criterion for each aid and
difficulty level. Aid levels (A1, A2, A3, A4, and A5) correspond
to the full aid, utility, outcomes, no uncertainty,
and no aid conditions, respectively.

These effects of difficulty seem to be reversed from the trend for
the ranked difference scores. Regret scores were lower for
the hard problems and higher for the easy problems, while the opposite
effect was noted for the ranked difference scores. This seeming
contradiction may be resolved by comparing the aid level by difficulty interaction
data. The a posteriori comparison of means indicated that the
only significant effect of difficulty for both the regret scores and the ranked
difference scores was due to the no aid, easy problem condition. Accordingly,
we believe the seeming reversal to be due to the no aid condition. As such,
the difference is artifactual or not of immediate interest to an evaluation of
the ASTDA.

Learning 

It  is  possible  that  as  a  subject  worked  his  way  through  the  problems, 
he  may  have  learned  some  important  aspect  about  the  use  of  the  aid,  the 
variables  involved,  and  the  context.  Such  learning  might  be  expressed  as 
an  increasing  approximation  to  the  predictions  of  the  aid. 

To  evaluate  this  possibility,  the  successive  ordering  of  problems 
for  each  subject  was  recovered  and  the  problem  set  was  divided  into  halves. 
The  early  half  consisted  of  the  first  four  problems  the  subject  completed 
and  the  late  half  consisted  of  the  last  four.  The  performance  measures 
used to evaluate learning effects were the ranked difference scores from
the aid and from the experts. A separate analysis of variance was completed
for each set of criterion data.
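
The early/late split can be sketched as follows. The eight scores are hypothetical ranked difference scores for one subject, listed in the order the problems were completed; the function name is an assumption made for illustration.

```python
from statistics import mean


def split_half_means(scores_in_order):
    """Mean score for the early half and the late half of a problem sequence.

    scores_in_order: one score per problem, in the order the subject
    completed the problems (eight problems in this evaluation).
    """
    half = len(scores_in_order) // 2
    early = scores_in_order[:half]   # first four problems completed
    late = scores_in_order[half:]    # last four problems completed
    return mean(early), mean(late)


# Hypothetical scores for one subject, in completion order.
scores = [6, 4, 5, 3, 2, 4, 2, 2]
early_mean, late_mean = split_half_means(scores)
```

For these made-up scores the early mean is 4.5 and the late mean is 2.5; a lower late mean is the pattern that would indicate learning.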

Results 

A summary of the variance analysis for learning effects employing
the aid computed utility values as the criterion is presented as Table 7
and a parallel summary employing the panel's judgments as the criterion
is presented as Table 8. The analysis employing the expert judgment criterion
failed to indicate any statistically significant differences. The analysis
employing the aid calculated utility criterion indicated a statistically
significant main effect due to aid levels and a tendency towards a significant
three way interaction. The main effect result is not pertinent to the
learning question. The interaction data are presented in Figure 7. Scores
tended to decrease (improve) in two conditions for the experienced subjects
and in three conditions for the inexperienced. Also, the magnitude of the
variance within aid levels was much more pronounced for the inexperienced
subjects.


Table 7

Variance Summary for the Learning Data--EU Criterion

                                         Sum of                Mean
Source                                  Squares      df      Square          F

Between                                 3010.94
Experience (E)                             2.00       1        2.00        0.10
Aid Levels (A)                          1054.04       4      263.51       13.37**
E x A                                     59.42       4       14.86        0.75
Error: Subjects within groups            985.10      50       19.70

Learning (L)                               5.42       1        5.42        0.40
L x E                                      8.27       1        8.27        0.61
L x A                                     99.01       4       24.75        1.83
L x E x A                              122.1584       4       30.54        2.26*
Error: L x Subjects within groups        967.52      50       13.51

* tendency toward significance
**p = 0.05 (difference not relevant to question of learning)


Table 8

Variance Summary for the Learning Data--Panel Criterion

                                         Sum of                Mean
Source                                  Squares      df      Square          F

Between                                 2544.46
Experience (E)                             1.32       1        1.32        0.08
Aid Levels (A)                            83.32       4       20.83        1.24
E x A                                     59.43       4       14.86        0.88
Error: Subjects within groups            839.50      50       16.79

Learning (L)                               5.37       1        5.37        0.19
L x E                                      7.82       1        7.82        0.28
L x A                                     53.37       4       13.34        0.48
L x E x A                                107.29       4       26.82        0.97
Error: L x Subjects within groups       1387.03      50       27.74


Given the design of the aid and the kinds of information it supplies,
there was little to learn about the use of the aid or interpretation of its
output. There may have been an increase in confidence in the aid's assessments
with time. However, this is not the factor of interest here.
The results suggest that there was little difference in the scores across
successive (early versus late) problems.

Policy  Capturing 

For each of eight problems, each subject was required to decide
on the ranking of six possible launch times. To derive his decision, the
subject had two sources of information available. "Objective" information
was supplied by the aid. (Objectivity, however, does not necessarily imply
factuality or accuracy.) The subject also relied on his personal intuition,
which might be called his cognitive-operational-emotive perception, strategies,
or schemes.

We are concerned here only with the objective information because
of our emphasis on evaluating the aid itself. An understanding of what aid
produced information the subjects emphasized and what weight they attached
to it would provide insight into their decision process. Such information
might also provide insight into design requirements for such aids.

The relationship between the subjects' choice of strike launch times
and the objective information available was analyzed. This analysis was
performed by use of the multiple regression technique. Such an approach
has been termed "policy capturing" by others because it essentially reveals
the policy followed by the subjects in deriving their decisions (Christal,
1968).

One dependent and eleven independent variables were included in the
analysis. The values of the dependent variable were the rankings of the
best two and the poorest two launch times. The values assigned to the independent
variables each represented a value derived from the various
displays: the BFA, ORAD, WAT, WAC, BFL, ORAL, and EU displays.
The information from the other two displays, the ORGF and ORGL, was
separated into four sets of data: the Orange ground defenses (ORGD), the
Orange ground targets (ORGT), the Orange ground defense losses (ORGL),
and the Orange targets destroyed (ORTD). The assignment of values to the
dependent and independent variables is illustrated in Table 9 for a hypothetical
subject. The two best and the two poorest times are listed on the left of the
table. Instead of using the times as the values of the dependent variable,
rankings were consistently assigned. A "4" was assigned to the first choice
and a "1" was assigned to the poorest time.

Corresponding to a particular time chosen by the hypothetical
subject, the values of the eleven independent variables are given. These
represent the data available to the subject vis-a-vis each time. When
the displays did not supply a single reference number for a particular
time, as was the case for the ORGT, WAT, WAC, and BFA displays, the
sum of the different unit types was used as the appropriate value of the
independent variable, e.g., the BFA value of 30.9 for 1700, in Table 9, is
the sum of the number of BF1s, BB1s, and BB2s (various aircraft types)
available at that particular time. A positive relationship existed for most
independent variables between the perceived goodness and the value of the
independent variables. For example, as the number of blue aircraft available
(BFA) increased, the perceived goodness to blue similarly increased.
However, for the ORAD, ORGD, and BFL displays, the reverse was true.
Accordingly, the reciprocal of the value of each of these independent variables
was entered into the analysis.
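
The variable coding described above can be sketched as follows. The function names and all display values are hypothetical, and the intermediate codes of 3 and 2 for the second best and second poorest times are an assumption; the text states only that the first choice received a 4 and the poorest time a 1.

```python
# Displays for which larger values are worse for blue; these enter the
# regression as reciprocals.
RECIPROCAL_VARIABLES = {"ORAD", "ORGD", "BFL"}


def code_dependent(best_two, poorest_two):
    """Map the four ranked launch times to dependent-variable codes 4..1."""
    first, second = best_two
    second_poorest, poorest = poorest_two
    # Codes 3 and 2 are assumed; only 4 and 1 are stated in the text.
    return {first: 4, second: 3, second_poorest: 2, poorest: 1}


def code_predictors(display_values):
    """Return predictor values, taking reciprocals where the text calls for it.

    A display value of zero would need separate handling; none appears here.
    """
    return {name: (1.0 / value if name in RECIPROCAL_VARIABLES else float(value))
            for name, value in display_values.items()}


# Hypothetical subject and hypothetical display readings for one launch time.
dependent = code_dependent(best_two=(1300, 1600), poorest_two=(1200, 1700))
predictors = code_predictors({"BFA": 30.9, "WAT": 12, "ORAD": 20, "BFL": 4})
```

Here `dependent` maps 1300 to 4 and 1700 to 1, while `predictors` leaves BFA and WAT unchanged and replaces ORAD and BFL with 1/20 and 1/4.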

A separate multiple regression analysis was conducted for various
conditions, i.e., by experience of subject, problem difficulty, aid levels,
and combinations. An overall analysis, with data collapsed across all
conditions, was also computed. Each analysis was stepwise. Cutoff criteria
of F = 2.00 and a tolerance of 0.1 were established.

Overall  Analysis 

In the overall regression analysis the data were collapsed across aid,
difficulty, and background levels. The results are presented in Table 10.
(The zero order correlation matrix for each multiple regression analysis
is found in the Appendix to this report.)

The first four variables entered into the equation represented ASTDA
input display information and the next three represented output display information.
The first variable to enter the equation, the WAT, yielded an
R of .52 and accounted for 27 percent of the total variance or about 70 percent
of the predictable variance. Each of the other variables, as they were
entered, accounted for a lesser amount of variance. The range for the successive
variables was from about 4 percent of total variance for the ORAD
reciprocal to less than .5 percent for the BFA. Overall, the multiple correlation
between the decisions of the subjects and the variables entered
into the equation was .61, accounting for 37 percent of the total variance.
One may possibly assume that the remaining variance can be accounted for
in some part by the cognitive-operational-emotive experience of the subjects.
Generally, the weights and order of entry of variables into the
equation seem to be reasonable. The strongest influence on choice was apparently
exerted by the WAT and the ORAD. Other variables tended to influence
the dependent variable very little. Note also that the first four variables
to enter were based on enemy posture/conditions rather than on the
friendly (blue) situation.
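
The percentages quoted above follow from squaring the multiple R at each step. The sketch below recomputes them from the multiple R column of Table 10; the function name is illustrative, and small discrepancies in the fourth decimal place arise because the tabled R values are themselves rounded.

```python
def r_squared_steps(multiple_rs):
    """For each step, return (variable, R squared, increase in R squared)."""
    steps = []
    previous = 0.0
    for name, r in multiple_rs:
        r2 = r * r
        steps.append((name, r2, r2 - previous))
        previous = r2
    return steps


# Multiple R after each variable entered (Table 10).
table_10 = [("WAT", 0.5185), ("ORADR", 0.5548), ("ORGT", 0.5639),
            ("ORGDR", 0.5722), ("BFLR", 0.5908), ("ORTL", 0.6070),
            ("ORGL", 0.6101)]

steps = r_squared_steps(table_10)
total_r2 = steps[-1][1]                 # variance explained by all seven
wat_share = steps[0][1] / total_r2      # WAT's share of predictable variance

for name, r2, change in steps:
    print(f"{name:<6} R2 = {r2:.4f}  change = {change:.4f}")
```

With these inputs the WAT alone gives an R squared of about .27, the full equation about .37, and the WAT's share of the predictable variance is about .72, in line with the "about 70 percent" stated in the text.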


Experience 


A separate regression analysis was completed in which the experience
of the subjects was fractionated. The resultant data are presented
in Table 11. Background did not seem to produce any important
observable differences either between groups or as compared to the data
from the overall analysis. The Rs, R²s, R² changes, B weights, and
constants were remarkably similar. The variables and their order of entry
from the separate experienced and inexperienced groups were similar
to those from the overall analysis. A comparison of the results for the experienced
and the inexperienced groups indicates only minor differences in
the order in which the variables entered the equations. The variance analysis
also failed to indicate consistent differences across experience levels.
It seems that, at least for the conditions of the present evaluation, operational
experience was not a substantial influence on either accuracy or on the decision
making policy.


Difficulty

The data were separated by problem difficulty assignment and similarly
analyzed. The analysis of the hard problems (Table 12) indicated
that the first two variables to enter the equation were the BFL reciprocal
and the EU. This was the first regression analysis in which the variables
derived from ASTDA output information entered early and the first time
that EU entered the equation at all. In the analysis of the data from the easy
problems, the ordering of the variables was similar to (but not congruent with)
that observed in the overall analysis. That is, the first two variables entered
(the WAT and the ORGD reciprocal) were from ASTDA input displays.


There were also substantial differences between the easy and the hard
problems in the amount of variance accounted for. In all, five variables
were entered into the multiple regression equation for the hard problems,
and a multiple R of .62, accounting for 38 percent of the variance, was
produced. However, the multiple R for the easy problems was higher (.75)
and accounted for 57 percent of the total variance.
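
The percentage of variance accounted for is the square of the multiple R.
A small check, using the final multiple R values listed in Table 12 for
the hard and easy problems (the function name is hypothetical):

```python
def variance_accounted_for(multiple_r):
    """Percent of variance accounted for by a multiple correlation R."""
    return 100.0 * multiple_r ** 2

# Final multiple R values as listed in Table 12.
hard = variance_accounted_for(0.6174)
easy = variance_accounted_for(0.7521)
print(f"hard: {hard:.1f} percent, easy: {easy:.1f} percent")
```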


This seems to suggest that differences existed in the human information
processing for the easy and the hard problems and in the consistency with
which the information was used. On the one hand, for the hard problems,
the data suggest that the subjects tended to make choices in line with
specific sources of output information (the BFL reciprocal and the EU),
which together accounted for 28 percent of the variance. Then, they
apparently qualified their choices by considering specific input
information supplied by the ORGD, ORGT, WAT, and WAC displays. On the
other hand, when working with easy problems, the major correlates of the
decisions seem to have been input information from the WAT and the ORGD,
which together accounted for 52 percent of the total variance. The
decisions appear to be further qualified by considering other sources of
both input and output information: the ORGT, WAC, BFL, ORGL, and ORAL.

Table 10

Results of Overall Regression Analysis

Order of Variable    Multiple              R²          B           A
Entering Equation       R         R²     Change      Weight     Constant

WAT                   .5185     .2688    .2688        2.00
ORADR*                .5548     .3078    .0389        7.25
ORGT                  .5639     .3180    .0101       -0.03
ORGDR*                .5722     .3274    .0094      -33.97
BFLR*                 .5908     .3490    .021?       17.73
ORTL                  .6070     .3684    .0194        0.18
ORGL                  .6101     .3723    .0038       -1.50
BFA                   .6131     .3759    .0037       -0.04       -1.10

*Variables entered as reciprocals; R = reciprocal


Table 11

Results of Regression Analysis by Background

Order of Variable    Multiple              R²          B           A
Entering Equation       R         R²     Change      Weight     Constant

Experienced

WAT                   .5138     .2640    .2640        1.70
ORADR*                .5008     .3034    .0394        6.14
ORGT                  .5636     .3177    .0143       -0.03
ORGDR*                .5705     .3255    .007?      -34.30
BFLR*                 .5910     .3493    .0238       18.00
ORTL                  .6073     .3688    .0195        0.21
ORGL                  .6112     .3736    .0048       -1.61
BFA                   .6133     .3751    .0025        0.03       -.62

Inexperienced

WAT                   .5233     .2738    .2738        2.30
ORADR*                .5589     .314?    .0336        8.40
ORGDR*                .5670     .3215    .0090      -33.70
BFLR*                 .5862     .3437    .0022       17.49
ORGL                  .6029     .3636    .0199        0.15
ORTL                  .6079     .3695    .0059       -1.41
ORGT                  .6100     .3721    .0026       -0.02
BFA                   .6142     .3773    .0051        0.05       -1.46

*Variables entered as reciprocals; R = reciprocal


Table 12

Results of Regression Analysis by Problem Difficulty

Order of Variable    Multiple              R²          B           A
Entering Equation       R         R²     Change      Weight     Constant

Hard Problems

BFLR*                 .4426     .1959    .1959       28.75
EU                    .5303     .2812    .0853        0.10
ORGDR*                .5537     .3066    .0253      -74.53
ORGT                  .6060     .3608    .0542       -0.12
WAT                   .6162     .3798    .0190       -5.63
WAC                   .6174     .3812    .0015        1.56        5.82

Easy Problems

WAT                   .6724     .4521    .4521        5.63
ORGDR*                .7229     .5226    .0706      -25.70
BFLR*                 .7410     .5492    .0265        6.77
ORGT                  .7450     .5551    .0059        0.04
ORGL                  .7494     .5616    .0065       -2.34
WAC                   .7510     .5641    .0025        1.35
ORAL                  .7521     .5657    .0016        0.07       -3.43

*Variables entered as reciprocals; R = reciprocal

Aid Levels


Table 13 summarizes the regression analyses completed in regard to aid
levels. There was a sharp dichotomy between aided conditions (in which
all or part of the output displays were available) and the no aid
condition. In the no aid condition, only minimal variance (R = .37;
variance accounted for = 13 percent) was accounted for by the multiple
regression equation. This seems to suggest little consistency among these
subjects in how they solved the problems, producing a distribution of
choices that was almost uncorrelated with any of the independent
variables included in this analysis.

This was not the case for the aided conditions, where the variance
accounted for was much higher. The variance accounted for was highest in
the full aid condition, in which 54 percent of the total variance was
identified. Lower amounts of variance were accounted for in the utility,
outcome, and no uncertainty conditions (46, 44, and 46 percent,
respectively). Consistently, the first two variables entering the
equation were the WAT and the ORAD. These two variables, together,
accounted for between 76 and 82 percent of the total variance that was
accounted for.
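
The share attributed to WAT and ORAD can be recomputed as the R² after
the second variable entered divided by the final R² for each aided
condition. The sketch below uses the R² values as they appear in Table 13;
the dictionary layout and names are illustrative only. The shares it
produces fall in the mid-70s to low-80s percent, close to the range
quoted in the text.

```python
# (R^2 after WAT and ORAD entered, final R^2), as listed in Table 13.
table13 = {
    "Full Aid": (0.4241, 0.5391),
    "Utilities": (0.3654, 0.4636),
    "Outcome": (0.3620, 0.4416),
    "No Uncertainty": (0.3451, 0.4593),
}

# Percent of the explained variance carried by the first two variables.
shares = {
    condition: 100.0 * r2_two / r2_final
    for condition, (r2_two, r2_final) in table13.items()
}

for condition, share in shares.items():
    print(f"{condition:15s} {share:.1f} percent of explained variance")
```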

Other  regression  analyses  were  completed  on  the  data.  These  anal¬ 
yses  involved  aid  by  background,  aid  by  difficulty,  background  by  difficulty, 
and  aid  by  background  by  difficulty.  The  results  produced  multiple  regres¬ 
sion  equations  which  were  very  similar  to  those  reported. 

Discussion  of  Regression  Analyses 

By and large, the most powerful single correlate of choice was the
weather at the target, followed by information about the enemy air
defenses. This generalization is mitigated when the effects of difficulty
are considered. For hard problems, own losses and expected utility were
strongly related to choice. Hence, difficulty level seems to act as a
moderating variable on decision policy.

It  seems  that  two  fairly  distinct  sets  of  information  were  used  when 
solving  easy  as  compared  with  hard  problems.  Emphasis  in  solving  easy 
problems  was  based  primarily  on  input  information.  For  hard  problems, 
output  information  played  a  somewhat  greater  role.  The  result  was  some¬ 
what  variant  regression  solutions  for  the  two  problem  types. 

Hence, the aid configuration which is best for one problem difficulty
level may not be best for another difficulty level. The analyses of the
aid levels indicated differences between aided conditions and no aid
conditions. There were differences among the variables entering the
equation and in the variance accounted for. The two variables which were
most highly correlated with the decisions, WAT and ORAD, were available
in all conditions. That WAT and ORAD were not used consistently in the no
aid condition might be understood on the basis that the subjects in the
no aid condition did not have the parts of the aid that served feedback
functions. Perhaps WAT and ORAD exerted their influence in the aided
conditions because, when output information was available, the effects of
these variables could be clearly understood. That is, the output may have
served a feedback function, sensitizing subjects to the effects of the
input information. Sensitive to these effects, the subjects may have
placed their emphasis on WAT and ORAD. The lack of the feedback mechanism
in the no aid condition may have served to prohibit the subjects from
being sensitive to the variables. The subjects may have assessed the
situation by some subjective criterion, resulting in distributions of
choices which were not strongly correlated with any of the independent
variables included in the regression analyses.


Table 13

Results of Regression Analyses for Various Aid Conditions

Order of Variable    Multiple              R²          B           A
Entering Equation       R         R²     Change      Weight     Constant

Full Aid

WAT                   .6094     .3714    .3714        1.34
ORADR*                .6513     .4241    .0527        7.82
ORGT                  .6677     .4458    .0217       -0.04
ORGDR*                .6752     .4559    .0101      -43.05
BFLR*                 .7058     .4982    .0423       23.84
BFA                   .7278     .5296    .0314        0.06
ORGL                  .7316     .5351    .0056       -1.80
ORTL                  .7343     .5391    .0039        0.18       -1.19

Utilities

WAT                   .5525     .3052    .3052        1.16
ORADR*                .6045     .3654    .0602       -0.39
ORGL                  .6244     .3899    .0245       -4.29
ORGT                  .6331     .4043    .0110       -0.03
WAC                   .6372     .4060    .0051       -1.08
ORGDR*                .6403     .4100    .0040      -34.64
EU                    .6565     .4310    .0209        0.05
BFLR*                 .6809     .4636    .0326       13.64        2.81

Outcome

WAT                   .5755     .3313    .3313        0.98
ORADR*                .6016     .3620    .0307      -10.28
ORGDR*                .6142     .3773    .0153      -37.83
BFLR*                 .6321     .3996    .0223       18.32
ORTL                  .6645     .4416    .0421        0.36        0.19

No Uncertainty

WAT                   .5677     .3223    .3223        1.55
ORADR*                .5874     .3451    .0228       -7
ORGDR*                .5999     .3599    .0148      -43.
BFLR*                 .6243     .3898    .0298       19.86
ORTL                  .6687     .4471    .0573        0.42
ORGT                  .6745     .4547    .0076       -0.03
WAC                   .6778     .4593    .0045       -1.36        1.29

No Aid

WAC                   .2891     .0835    .0835        2.14
BFLR*                 .3294     .1085    .0249        8.18
ORGDR*                .3584     .1284    .0199      -11.46
ORGL                  .3658     .1338    .0054       -1.11        0.29

*Variables entered as reciprocals; R = reciprocal

The  concept  of  the  aid  as  a  feedback  mechanism  may  also  account 
for  the  greater  variance  accounted  for  in  the  full  aid  condition  as  compared 
with  the  partially  aided  conditions.  The  three  types  of  information  avail¬ 
able  (the  outcomes,  expected  utilities,  and  statements  of  variability  in  the 
predictions)  may  have  complemented  one  another  producing  more  sensitive 
feedback  functions  and  hypotheses  than  occurred  with  less  complete  com¬ 
binations  of  information.  Again,  this  may  have  enhanced  the  tendency  to 
correlate  choices  with  certain  specific  classes  of  information,  either  WAT 
and  ORAD,  or  BFL  and  EU,  depending  on  the  nature  of  the  problem. 

Examination  of  Merit 

The  merit  of  the  ASTDA  may  be  specified  as  an  estimate  of  de¬ 
cision  quality  when  the  full  aid  was  used  as  compared  with  the  decision 
quality  in  the  no  aid  condition.  Decision  quality  may  be  defined  as  the  re¬ 
lationship  between  decisions  made  by  the  evaluation  subjects  and  those 
made  by  the  expert  panel  in  each  condition.  Specifically,  the  correlation 
(and  the  variance  accounted  for)  between  the  experts'  judgments  and  the 
subjects'  choices  in  the  fully  aided  condition  versus  the  no  aid  condition 
may  be  employed  to  yield  a  measure  of  merit  for  the  ASTDA. 

To  this  end,  a  number  of  product  moment  correlation  coefficients 
were  calculated.  These  were  based  on  the  utility  value  associated  with 
the  best  launch  time  for  each  problem  specified  by  the  experts  and  that 
chosen  by  the  subjects.  The  resultant  correlation  coefficients  are  shown 
below: 


                                      Full Aid     No Aid

All Problems                    r       .39        [illegible]
Without Preemptive Problems     r       .87          .33
Mean                                    .71          .30


All data were normalized prior to calculating the correlation
coefficients, and the mean r values were calculated with the normal z
transformation. The mean correlation between the fully aided condition
and the experts was .71, accounting for 50 percent of the variance. The
mean correlation between the no aid condition and the experts was .30,
accounting for nine percent of the variance. This suggests a 5 to 1 ratio
which reflects the difference between the variance accounted for when
using the aid and when not using the aid in relation to the expert opinion
criterion. Stated alternatively, use of the ASTDA increased decision
validity by a factor of five.
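
A minimal sketch of the averaging step, assuming the "normal z
transformation" is Fisher's z (atanh of r), with the mean taken in z units
and transformed back. Only the two full aid coefficients that are legible
above are used; the function name is hypothetical. The result lands near
.70, close to the reported mean of .71, and the small gap may reflect
rounding of the printed coefficients.

```python
import math

def mean_correlation(rs):
    """Average correlations via Fisher's z: atanh, mean, then tanh."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Full aid correlations with the expert panel, as printed above.
full_aid_mean = mean_correlation([0.39, 0.87])

# Ratio of variance accounted for, using the reported mean correlations.
variance_ratio = (0.71 ** 2) / (0.30 ** 2)

print(f"mean r (full aid): {full_aid_mean:.2f}")
print(f"variance ratio, full aid to no aid: {variance_ratio:.1f}")
```

Squaring the reported means gives roughly 50 percent against 9 percent,
which is the basis of the 5 to 1 statement.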

Difficulty and Confidence Rating Data

After a subject recorded his strike launch time choices, he was asked
about his confidence in his decisions and how difficult it was to arrive
at the decisions. As indicated in the earlier section on Problem
Selection, difficulty was defined in terms of the spread of possible
outcomes across potential strike times. A rather strong negative
correlation between the confidence and the perceived difficulty ratings
(r = -.62) was evidenced. However, no correlation between the a priori
difficulty values and either the confidence ratings (r = -.15) or the
perceived difficulty ratings (r = -.03) was evidenced. This lack of any
relationship with the previously defined difficulty was surprising,
because the variance and the regression analyses clearly showed
differential effects of difficulty. It seems that the subjects did not
perceive the problems in which the BFL and EU were the major correlates
of choice to be more difficult. Possibly, their perception of the
situation was one of more confusion or one which demanded more
consideration but not difficulty per se. However, the subjective report
of difficulty does not seem to be associated with our "objective"
measure.

Multiattribute  Utility 


Method 


An attempt was made to evaluate further the perceived utility of the aid
by assessing how closely the aid achieved its goals. Six ASTDA goals were
developed. They are listed in Table 14. One requirement of the
multiattribute utility analytic technique (Edwards, 1971) is a statement
of the relative importance of the goals (elements) being considered.
Judgments of goal importance and the assignment of goal weights were
completed by two of the Applied Psychological Services' staff members who
were involved in the ASTDA evaluation. Each independently distributed 100
points among the goals to reflect his judgment of the importance of each
goal. The weights were then compared, discussed, adjusted, and mutually
agreed on. The weights are included in Table 14.


Table 14

ASTDA Goals and Weights

Goal                                                             Weight

1) To provide a system to assist in the derivation of the
   best possible time to launch an air strike.                     35

2) To provide a method for structuring and organizing
   available information pertinent to the strike timing
   decision.                                                        5

3) To provide, given available data, the possible results
   of various strike launch time decisions.                        15

4) To provide information about the trade-offs (e.g., own
   or enemy losses) relative to various strike time
   decisions.                                                      25

5) To provide a criterion against which strike timing
   decisions can be evaluated or verified.                         13

6) To support the decision maker so that various strike
   timing decisions can be made more quickly/accurately.            7

Each subject who participated in the study was asked in an after
evaluation interview to assign a rating along a "0" to "100" scale on the
extent to which the aid achieved each goal. By multiplying the weight of
a goal and the mean of the rating on the extent to which the aid achieved
the goal, a utility measure for the aid in reference to that goal was
obtained. This procedure was completed separately for each goal. The
resultant utility values are presented in Table 15. The top portion of
Table 15 presents the data collapsed across conditions, and the lower
portion presents the results by experience, aid level, and background by
aid level.
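
The weighting step can be sketched as follows. The goal weights are those
of Table 14; the mean ratings are hypothetical placeholders chosen only to
show the arithmetic, not values from the study, and the function name is
an assumption.

```python
# Goal weights from Table 14 (they sum to 100).
WEIGHTS = {1: 35, 2: 5, 3: 15, 4: 25, 5: 13, 6: 7}

def goal_utilities(mean_ratings):
    """Utility per goal = goal weight x mean 0-100 achievement rating."""
    return {goal: WEIGHTS[goal] * mean_ratings[goal] for goal in WEIGHTS}

# Hypothetical mean ratings on the 0-100 scale, for illustration only.
hypothetical_ratings = {1: 80.0, 2: 90.0, 3: 75.0, 4: 70.0, 5: 85.0, 6: 88.0}

utilities = goal_utilities(hypothetical_ratings)
total_utility = sum(utilities.values())
max_total = sum(w * 100 for w in WEIGHTS.values())

print(utilities)
print(f"total utility: {total_utility:.0f} of a possible {max_total}")
```

With weights summing to 100 and ratings capped at 100, the largest
attainable total is 10,000, which matches the maximum shown for the total
utility column of Table 15.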

The maximum value that could be attained relative to each goal and the
marginal total are shown in parentheses at the top of Table 15. As can be
seen, goals 2 and 6 were closely achieved by the aid. They were judged to
have been about 91 percent and 89 percent satisfied, respectively. The
other goals (1, 3, 4, and 5) were rated as 85, 81, 78, and 80 percent
satisfied, respectively. These values seem rather impressive.
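
The percent-satisfied figures follow from dividing each overall utility in
Table 15 by its maximum possible value (goal weight times 100). A short
check using the overall row as listed in the table; the variable names are
illustrative. The computed values agree with the percentages quoted above
to within about one point.

```python
# Maximum possible utility per goal (weight x 100) and the overall row
# of Table 15.
MAX_UTILITY = {"G1": 3500, "G2": 500, "G3": 1500,
               "G4": 2500, "G5": 1300, "G6": 700}
OVERALL = {"G1": 2991, "G2": 454, "G3": 1210,
           "G4": 1950, "G5": 1036, "G6": 618}

percent_satisfied = {
    goal: 100.0 * OVERALL[goal] / MAX_UTILITY[goal] for goal in OVERALL
}
overall_percent = 100.0 * sum(OVERALL.values()) / sum(MAX_UTILITY.values())

for goal, pct in percent_satisfied.items():
    print(f"{goal}: {pct:.1f} percent satisfied")
print(f"total: {overall_percent:.1f} percent of the maximum")
```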

Comparison  across  experience  levels  indicates  only  minor  total 
differences  due  to  this  effect.  There  were  only  minor  differences  in  ratings 
within  goals  of  about  3  to  7  percent.  Exceptions  were  the  9  percent  higher 
and  13  percent  lower  ratings  given  to  goals  3  and  5,  respectively,  by  the 
experienced  subjects. 

The aid level data suggest that this effect produced differences in
perceived utility for the aid. Comparing across aid levels indicates a
tendency for the highest ratings to be given by those subjects who were
exposed to the full aid condition. However, for goal 5, which was related
to providing a criterion for evaluating strike timing decisions, the
highest ratings were observed for the utility condition. The utility
condition ratings for goal 5 were 8 to 13 percent higher than for the
other goals. This finding might have been anticipated, because utility
represents a fundamental comparison criterion. Across the other goals,
only the ratings for goal 4 possessed any substantial variability across
aid levels. Goal 4 concerned own versus enemy losses and was perceived by
the subjects in the utility condition, the condition for which no loss
information was available, as very far from satisfied by the aid. For
goal 4, the highest ratings were obtained for the full aid condition, and
intermediate levels were obtained for the outcome and no uncertainty
conditions.

Table 15

Results of Multiattribute Utility Analysis

                            Utility For Goal (G)
                                                                  Total
                    G1      G2      G3      G4      G5      G6   Utility
Maximum Possible  (3500)  (500)  (1500)  (2500)  (1300)  (700)  (10,000)

Overall            2991    454    1210    1950    1036    618     8259
Experienced        2916    443    1266    2026     964    628     8243
Inexperienced      3065    464    1154    1875    1108    608     8274

Aid Levels
  Full Aid         2990    471    1269    2282    1002    630     8644
  Utility          2990    467    1237    1604    1127    630     8055
  Outcome          2972    431    1157    1958     980    624     8122
  No Uncertainty   3011    444    1177    1958    1035    588     8213

Experienced
  Full Aid         2858    454    1200    2313     845    624     8294
  Utility          2858    467    1275    1563    1137    659     7959
  Outcome          2946    446    1413    2208     953    618     8584
  No Uncertainty   2478    404    1175    2021     921    612     8134

Inexperienced
  Full Aid         3121    487    1337    2250    1159    636     8990
  Utility          2730    467    1200    1646    1116    601     7760
  Outcome          2998    417     900    1708    1007    630     7660
  No Uncertainty   3022    483    1180    1896    1148    563     8292

The experience by aid level data for goal 4 suggest that higher ratings
were assigned by the experienced subjects in the full aid, outcome, and
no uncertainty conditions than by the inexperienced subjects. The
information provided by the aid may have been more meaningful to the
experienced subjects.

Also  showing  some  variability  across  background  by  aid  levels  were 
the  ratings  relative  to  goal  5.  Goal  5  was  concerned  with  the  use  of  the 
ASTDA  as  an  evaluation  criterion.  The  experienced  group  rated  achieve¬ 
ment  of  this  goal  relatively  low,  at  least  in  the  full  aid,  outcomes,  and  no 
uncertainty  conditions. 

The  background  by  aid  level  analysis  also  suggested  some  variability 
relative  to  goal  1 — to  assist  in  the  derivation  of  the  best  possible  strike 
launch  time.  The  results  indicated  a  rather  low  goal  attainment  evaluation 
by  the  experienced  subjects  in  the  no  uncertainty  condition  and  to  a  lesser 
extent  by  the  inexperienced  subjects  in  the  utility  condition. 

After  Evaluation  Interview 

Each  subject,  after  completing  the  eight  scenario  problems,  was  inter¬ 
viewed  relative  to  his  impressions  of  the  ASTDA.  Information  was  sought 
about  usefulness  and  influence  of  various  aspects  of  the  aid. 

Input  Displays 

The  subjects  were  queried  about  the  usefulness  of  the  input  displays. 
They  indicated  their  response  on  a  five  category  rating  scale.  The  mean 
usefulness  ratings  for  the  input  displays  are  presented  in  Table  16.  The 
input  displays  considered  were  the  WAT,  WAC,  BFA,  ORAD,  and  ORGF. 

The  ORGF  information  was  treated  as  a  unit.  The  information  on  the  de¬ 
sired  number  of  blue  (DNB)  was  presented  to  the  subjects  embedded  within 
the  BFA  displays  but  was  rated  separately. 

Overall, the ratings tended to vary between "3" and "4," i.e., between
useful and highly useful. The highest rating, 3.84, was received by the
ORGF display. The BFA and the ORAD displays were rated as 3.71 and 3.62,
respectively. The lowest rating, 2.39, was assigned to the DNB (desired
number of Blue). The experienced subjects generally rated the input
displays to be more useful than the inexperienced subjects. This was true
for every display except WAT. Possibly, the experienced subjects, by
virtue of their backgrounds, were able to read more into the input
displays than the inexperienced subjects.

Table 16

Mean Ratings of Input Usefulness

                    WAT    WAC    BFA    DNB    ORAD   ORGF   Mean

Overall             3.42   3.02   3.71   2.39   3.62   3.84   3.33
Experienced         3.27   3.19   3.81   2.58   3.64   3.90   3.39
Inexperienced       3.56   2.86   3.60   2.20   3.60   3.77   3.27

Aid Levels
  Full Aid          3.58   3.00   2.84   2.10   3.00   3.42   2.99
  Utility           4.00   3.34   4.75   2.67   4.17   4.34   3.88
  Outcome           3.08   3.30   3.58   2.67   3.67   3.58   3.25
  No Uncertainty    2.84   2.64   3.2?   2.44   3.27   4.00   3.07
  No Aid            3.?8   3.25   4.17   2.09   4.00   3.83   3.49

Experienced
  Full Aid          3.33   2.83   2.67   2.20   3.00   3.50   2.92
  Utility           4.17   4.00   4.83   3.00   4.67   4.67   4.22
  Outcome           2.33   2.50   3.50   2.33   4.00   3.67   3.05
  No Uncertainty    2.86   3.28   3.57   2.71   2.71   4.00   3.19
  No Aid            3.67   3.33   4.50   2.67   3.83   3.67   3.61

Inexperienced
  Full Aid          3.83   3.17   3.00   2.00   3.00   3.??   3.06
  Utility           3.83   2.67   4.67   2.33   3.66   4.00   3.53
  Outcome           3.83   3.33   3.67   3.00   3.33   3.50   3.44
  No Uncertainty    2.83   2.00   2.83   2.17   3.83   4.00   2.94
  No Aid            3.50   3.17   3.83   1.50   4.17   4.00   3.36


When the data are considered across aid levels, the input information was
not rated highest in the no aid condition, which had only input
information available. Rather, on the average, the highest rating was
observed in the utility condition, which rated the input at 3.88. The
input information was rated lowest by subjects in the full aid and the no
uncertainty conditions, with scores of 2.90 and 3.07, respectively. The
mean rating of the subjects assigned to the outcome condition, 3.25, was
slightly higher.


These  findings  support  contentions  favoring  the  salience  and  useful¬ 
ness  of  most  of  the  input  information  provided  by  the  ASTDA. 


Outcome  Displays 


A  parallel  set  of  ratings  was  completed  for  the  outcome  displays. 
The  overall  usefulness  mean,  3.95,  was  somewhat  higher  than  the  use¬ 
fulness  mean  for  the  input  information.  The  results,  presented  in  Table 
17,  generally  support  the  usefulness  of  the  outcome  displays. 


The highest overall ratings, 4.50 and 4.14, were assigned to the ORGL and
the BFL displays, respectively. The ORAL display was rated slightly
lower, at 3.21.

The  comparisons  across  background  levels  suggest  that  the  experi¬ 
enced  subjects  tended  to  rate  the  usefulness  of  the  outcome  information 
lower  than  the  inexperienced  subjects,  except  for  the  ORGL  display. 

Examining the data across aid levels indicates that the outcome displays
were rated highest in the condition that had only one source of output
information (utility or outcome) available. The background by aid level
analyses of Table 17 suggest that this effect was only influential on the
judged usefulness of the outcome displays for the experienced subjects.
Here, the effect was strong. The outcomes were rated lower in the full
aid and no uncertainty conditions, when both sources of output
information were available, than they were when only the outcome
information was available. Examining the inexperienced subject data
across aid levels does not indicate the same type or degree of systematic
variability as observed in the experienced subject data.


Table 17

Mean Ratings of Outcome Usefulness

                    BFL    ORAL   ORGL   Mean

Overall             4.14   3.21   4.50   3.95
Experienced         4.11   2.87   4.72   3.90
Inexperienced       4.17   3.21   4.50   4.00

Aid Levels
  Full Aid          3.68   3.33   4.50   3.84
  Outcome           4.42   3.49   4.50   4.14
  No Uncertainty    4.33   2.83   4.50   3.87

Experienced
  Full Aid          3.67   2.33   4.50   3.67
  Outcome           4.50   3.30   4.83   4.21
  No Uncertainty    1.16   2.50   4.83   3.83

Inexperienced
  Full Aid          3.68   3.83   4.50   4.00
  Outcome           4.33   3.67   4.17   4.06
  No Uncertainty    4.50   3.17   4.17   3.95


Input-Outcome Interaction


The usefulness ratings by the experienced subjects can be employed to
quantify further the usefulness of various displays. Usefulness may be
defined:

\[ V_{H_I} = k \, \frac{H_I}{H_I + H_O + H_U} \]

where:  $V_{H_I}$ is the usefulness of the input information ($H_I$)
        $H_O$ is the information from outcome ratings
        $H_U$ is the information from utility ratings
        $k$ is a constant of proportionality ($1 \le k \le 5$).

Changes in $V_{H_I}$ as a function of changes in $H_{tot}$ (where
$H_{tot} = H_I + H_O + H_U$) are then an expression of the fact that all
information (H) is relative, and the importance (V) of one information
source is indirectly determined by the availability of other sources of
information. As applied to the present situation, when only input
information was made available, i.e., $H_{tot} - H_I = 0$:

\[ V_{H_I} = k \, \frac{H_I}{H_I}, \quad \text{or} \quad V_{H_I} = k. \]

When other sources of information were also available and contributed to
uncertainty reduction, then:

\[ V_{H_I} < k. \]

When additional sources of information acted so as to increase
uncertainty, then:

\[ V_{H_I} > k. \]

The usefulness rating assigned by the subjects (3.49) to the input
information ($H_I$), when presented in the no aid condition, was
moderately high. This value would be obtained when $H_O + H_U = 0$ so
that $V_{H_I} = k$, and can be considered as the reference value against
which the effects of making other information available can be compared.
The usefulness value ranged between 2.92 and 3.19 when either
$H_O + H_U$, or only $H_O$, was available along with $H_I$. This suggests
that $H_O + H_U$ was positive, i.e., contributed to uncertainty
reduction. $V_{H_I}$ rose sharply to 4.22 when only $H_U$ was made
available, suggesting that $H_O + H_U$ was negative. The utilities
without the outcome information did not reduce uncertainty and possibly
increased it.
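
The reasoning above can be illustrated numerically. The sketch assumes
the usefulness relation has the proportional form V = k * H_source /
H_total, which is how the surrounding text reads; the function names and
the unit information value for the input source are hypothetical. Taking
k = 3.49 (the no aid reference rating), the observed ratings of 2.92 and
4.22 imply a positive and a negative net contribution from the other
sources, respectively.

```python
def usefulness(k, h_source, h_others):
    """Assumed model: V = k * H_source / (H_source + H_others)."""
    return k * h_source / (h_source + h_others)

def implied_other_ratio(k, observed_v):
    """Solve the assumed model for H_others / H_source given a rating."""
    return k / observed_v - 1.0

K_REFERENCE = 3.49  # input usefulness rating in the no aid condition

# With no other information, usefulness equals k.
alone = usefulness(K_REFERENCE, h_source=1.0, h_others=0.0)

# Ratings quoted in the text for the experienced subjects.
with_outcomes = implied_other_ratio(K_REFERENCE, 2.92)   # positive
utility_only = implied_other_ratio(K_REFERENCE, 4.22)    # negative

print(f"input alone: V = {alone:.2f}")
print(f"implied H_others/H_input at V = 2.92: {with_outcomes:+.3f}")
print(f"implied H_others/H_input at V = 4.22: {utility_only:+.3f}")
```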


The usefulness of outcome displays is given by:

\[ V_{H_O} = k \, \frac{H_O}{H_I + H_O + H_U} \]

Since $H_I$ was constant across all conditions, the only interest is in
the effect of incrementally providing utility displays ($H_U$) for
experienced subjects with respect to $H_O$. When only outcome information
was made available, $V_{H_O} = 4.21$. Additionally providing $H_U$
reduced $V_{H_O}$ to an average value of 3.75. This decrease strongly
suggests that $H_U$ increased $H_{tot}$ by reducing uncertainty for
experienced subjects. These findings allow some additional insight into
and refinement of our understanding of expected utility (EU) in
particular and of the information presented in general. Apparently, EU
(or $H_U$) presented only in conjunction with input information ($H_I$)
had little, if any, beneficial effect, but, when presented along with the
outcomes ($H_O$), it did have a beneficial effect. This suggests that the
simple notion that providing increased amounts of information
($\Delta H_{tot}$) will necessarily increase the value of the information
($\Delta V_{H_{tot}}$) for the user is misleading. Again, as indicated by
the variance analyses, more is not necessarily better.

Influence  of  Outcome  Displays 

The subjects were also asked to rate the influence of the output displays
on their strike timing decisions. Again, five category scales ranging
from "no influence" to "very much influence" were employed. The data
derived from these questions are presented in Table 18. For the overall
analysis, the highest influence rating of 4.47 was assigned to the ORGL,
followed by similar ratings of 4.02 and 3.97 for the EU and the BFL,
respectively. The lowest rating, 3.08, was received by the ORAL display.

The  data  for  the  experienced  subjects  suggest  that  they  were  most 
influenced  by  the  ORGL  outcome  while  the  inexperienced  subjects  were 
most  influenced,  on  the  average,  by  the  EU  information. 

The comparison across backgrounds by aid levels suggests that for the
experienced subjects the highest ratings came from the utility and
outcome conditions, with averages of 4.17 and 4.19, respectively. With
almost perfect consistency, the outcome displays were rated lower in
influence in the full aid and no uncertainty conditions, with averages of
3.87 and 3.81, respectively. That is, when both the outcome and the
utility displays were available, they were rated lower than when either
one set or the other was available.

The  distribution  of  the  influence  ratings  by  the  inexperienced  sub¬ 
jects  showed  few  systematic  differences  across  aid  levels.  This  finding 
parallels  that  already  reported  in  the  usefulness  ratings. 


In  keeping  with  the  reasoning  and  notation 


VHi  =  k 


Table 18

Mean Ratings of Output Influence

                        BFL     ORAL    ORGL    EU      Mean

Overall                 3.97    3.08    4.47    4.02    3.93
Experienced             3.83    2.94    4.72    3.78    3.82
Inexperienced           4.11    3.23    4.22    4.28    3.96

Aid Levels
  Full Aid              3.75    3.08    4.08    3.75    3.67
  Utility                --      --      --     4.17    4.17
  Outcome               4.58    3.25    4.75     --     4.19
  No Uncertainty        3.58    2.93    4.58    4.17    3.81

Experienced
  Full Aid              3.50    3.00    4.33    3.00    3.46
  Utilities              --      --      --     4.33    4.33
  Outcome               4.67    3.17    4.83     --     4.22
  No Uncertainty        3.33    2.66    5.00    4.00    3.75

Inexperienced
  Full Aid              4.00    3.16    3.83    4.50    3.87
  Utility                --      --      --     4.00    4.00
  Outcome               4.50    3.33    4.67     --     4.17
  No Uncertainty        3.83    3.20    4.16    4.33    3.88

where Hi represents information from display i. Therefore, any factor
that increases Htot - Hi will decrease VHi. This condition would be
maximal when both outcome and utility displays were available and produced
positive modifying effects and a relatively low VHi. When either utility
or outcomes were the only output information available, VHi = 4.28 on the
average, suggesting that Htot - Hi was comparatively small. When both
outcome and utility displays were concurrently available, VHi = 3.60 on the
average, suggesting that Htot - Hi was large. Therefore, it could be
argued that the influence of information on the strike timing decisions, like
usefulness, could be considered relativistic, being inversely related to
other modifying information concurrently available. Note that these effects
were only consistently observed in the data for the experienced subjects.
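
The two averages quoted above can be checked against the row means for the
experienced subjects in Table 18. The short Python sketch below does only
that arithmetic; the dictionary keys and variable names are illustrative
and are not part of the original analysis.

```python
# Row means for the experienced subjects, copied from Table 18.
experienced_row_means = {
    "Full Aid": 3.46,
    "Utilities": 4.33,
    "Outcome": 4.22,
    "No Uncertainty": 3.75,
}


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Conditions in which only one kind of output display was available.
single_output = mean(
    experienced_row_means[name] for name in ("Utilities", "Outcome")
)

# Conditions in which outcome and utility displays were both available.
both_outputs = mean(
    experienced_row_means[name] for name in ("Full Aid", "No Uncertainty")
)

print(f"one output type available:   {single_output:.3f}")
print(f"both output types available: {both_outputs:.3f}")
```

The two results agree with the 4.28 and 3.60 reported in the text to within
rounding of the tabled values.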

The finding seems reasonable when one considers the fact that the effect is
essentially a measure of the relative sensitivity of decision makers to
information from various sources. The extent to which any set of stimuli is
informative  depends  on  the  background  and  experience  of  the  decision  maker 
with  respect  to  the  meaning  of  the  information.  Sensitivity,  therefore,  is 
a  function  of  experience.  The  experienced  subjects  were  apparently  more 
sensitive  to  the  information  because  they  have  had  more  experience  with 
strike  launch  situations  and  a  consequent  better  understanding  of  possible 
effects  of  each  set  of  information. 


Discussion  of  Ratings 

The  ratings  provide  some  important  insights  about  the  ASTDA.  The 
experienced  subjects  apparently  considered  the  input  information  to  be  more 
useful  than  the  inexperienced  subjects  and  generally  the  output  information 
was  less  important  to  the  experienced  subjects  than  to  the  inexperienced 
subjects.  However,  both  groups  rated  the  outcome  information  as  the  more 
important.  Consistently  the  ORAL  and  BFL  information  was  indicated  to 
be  the  most  important  to  the  experienced  subjects.  Even  with  this  con¬ 
sistency,  the  experienced  subject  data  indicated  that  they  were  very  sensitive 
to  the  type  of  information  available.  This  finding  provides  some  explanation 
for  the  previously  reported  result  that  our  expert  panel  selected  preemptive 
strike  times  for  two  problems.  Evidently  the  panel  members,  as  compared 
with  our  subjects,  were  differentially  sensitive  to  the  information  provided. 

Other  Opinions 

A  set  of  open  ended  questions  was  included  in  the  interview  to  allow 
for  the  derivation  of  qualitative  information  about  the  aid.  The  information 
from  the  experienced,  fully  aided  subjects  will  be  used  to  draw  a  picture 
of  the  aid  as  they  saw  it.  The  information  from  the  other  subjects  and  aid 
level conditions will then be discussed insofar as it adds detail or new
elements  to  the  discussion. 




Generally, the fully aided, experienced subjects perceived the aid
as "pretty helpful" and indicated that it "could be a good tactical decision
aid." These statements tended to be qualified on two counts: (1) that the
aid is only as good as the input information, and (2) that the algorithms
built into the aid are reasonable. The concern about input information
was rather consistently stated. There seemed to be a pervasive feeling
that information supplied by weather and by intelligence officers tends to
be less than fully reliable. Because a considerable part of the ASTDA's
information is based on these sources (WAT, WAC, ORAD, and ORGF),
and because the ASTDA's outputs are derived from those sources, it was
generally indicated that for the aid to be useful, this information had to be
accurate. Concerns about the algorithms were fewer and mostly related
to the utility measure. There was a tendency to question the expected
utilities as being "too inclusive" or "too general."

The  subjects  rarely  questioned  the  validity  of  the  loss  information 
(BFL, ORAL, ORGL). In fact, when the input information was divergent,
the subjects said that they tended to base their decisions on the loss
information. Specifically, they said that they attempted to balance the
information on their own losses and the destruction of enemy targets. Also, some
subjects said that they compared their decision to the EU information and,
if there was an important discrepancy, they reconsidered their original
decision. However, when the input information was convergent, and
alternatives were rather obvious, the decisions tended to be based on input
information.  Therefore,  the  usefulness  of  the  output  information  tended 
to  be  somewhat  proportional  to  the  divergence  of  the  input  information. 

Other interview questions examined specific aspects of the aid,
e.g., the display formats, advantages of color, etc. When asked if the
tabular or the graphic information presentation was more useful, the
subjects overwhelmingly chose the tabular format. They generally thought
that the graphs were difficult to read. The importance of color for the
graphs was also rated rather low, helping as a discriminant but no more.

When asked about the usefulness of the averages and ranges
displayed on both the tables and graphs, the responses were more variable.
About half of the subjects said that the averages were more helpful while
the other half thought that both the averages and ranges were helpful.
Subjects who preferred the averages wanted to see things at a glance or to
obtain a ready indicator while those who preferred both said that the averages
were deceptive and that the range information presented a better picture of
what to expect.

When questioned as to whether or not the ASTDA helped them to have
more confidence in decisions, the overwhelming answer was affirmative.
The reasons given were that the organization of the information tended to
focus thinking on a few particulars, to indicate possible trends, and
possibly reduce human error. The one person who answered negatively
qualified his statement by noting that while not increasing his confidence,
the aid certainly provided a faster means for deriving a decision.

Similar  affirmative  answers  were  also  obtained  when  the  experi¬ 
enced  subjects  were  asked  if  ASTDA  aided  decisions  were  better  decisions 
than  non  ASTDA  aided  decisions,  and  if  they  would  feel  comfortable  using 
ASTDA  during  actual  combat  conditions.  However,  laced  through  their 
positive  responses  to  ASTDA  were  again  the  qualifications  that  they  would 
be "confident," "comfortable," and "make better decisions" only if the
information  entered  into  the  system  was  accurate. 

The subjects also were asked about the additional information which
the ASTDA should supply and if they had any further comments. The
responses ranged over a wide area. Several experienced subjects indicated
that the aid should include a psychological readiness of pilots factor which
could interact with other factors and affect outcomes. Such behavioral
modeling is well within the current state of the art. It was also suggested
that the aid does not consider a range of relevant strategies, e.g., with a
low probability of good visibility at the target, the cloud cover could be
used strategically (sending two strikes, one above and one below the cloud
cover).


Similarly,  it  was  suggested  that  the  aid  should  consider  various 
mixes of armament and ordnance. Others thought that the aid did not
address some very important points, e.g., search air rescue, refueling time
after launch, rendezvous times and places, as well as some minor points,
e.g., aborts of the mission not due to enemy actions.

One  area  of  recurring  concern  had  to  do  with  the  graphic  displays. 

It  was  suggested  repeatedly  that  relevant  graphs  should  be  either  super¬ 
imposed  or  presented  simultaneously  on  a  split  screen,  or  nomographically, 
or  in  some  combination  which  would  simplify  comparisons.  It  was  also 
suggested  that  in  the  outcome  displays  (BFL,  ORAL,  ORGL),  the  lost  or 
destroyed  units  should  be  weighted  and  summed.  In  addition,  it  was  sug¬ 
gested  frequently  that  the  information  on  orange  ground  target  availability 
and  destruction  should  be  separated  from  the  information  on  orange  ground 
defenses  likely  to  be  encountered  and  destroyed. 
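
The suggestion that lost or destroyed units be weighted and summed can be
illustrated with a minimal Python sketch. Everything in it is an assumption
made for illustration: the class name, the unit counts, the weights, and
the sign convention (own losses negative, enemy losses positive) do not
come from the ASTDA or from the subjects' comments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitLoss:
    display: str   # outcome display the figure comes from, e.g. "BFL"
    count: float   # projected number of units lost or destroyed
    weight: float  # hypothetical relative value of one such unit


def weighted_loss_total(losses):
    """Collapse several loss figures into one weighted sum."""
    return sum(item.count * item.weight for item in losses)


# Hypothetical figures for a single candidate launch time.
example = [
    UnitLoss("BFL", count=2.0, weight=-3.0),   # own losses count against
    UnitLoss("ORAL", count=4.0, weight=1.5),
    UnitLoss("ORGL", count=6.0, weight=1.0),
]

score = weighted_loss_total(example)
print(score)
```

A single figure of this kind per launch time would let the three outcome
displays be compared at a glance, at the cost of hiding the weights unless
they are also shown to the user.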


The  information  obtained  from  the  experienced  subjects  exposed  to 
the  other  aid  conditions  supplements  the  prior  considerations.  One  aspect 
which  seems  rather  relevant  concerns  the  expressed  need  of  the  utility 
condition  subjects  for  specific  loss  information,  and  by  the  outcome  con¬ 
dition  subjects  for  a  general  measure  of  trade-off  or  utility.  This  finding 
supports  the  need  for  such  information  as  included  in  the  aid.  The  discussion 
of influence ratings (in the section on Influence of Outcome Displays) also
indicated the augmental/supplemental nature of the two types of display.

We  also  note  our  prior  finding  of  no  statistically  significant  differences 
between  the  experimental  conditions  in  which  either  one  or  the  other  of 
these types of display was available but the statistically significant
superiority of the fully aided over the partially aided conditions. Subjects from
these  conditions  also  tended  to  express  the  need  for  accurate  algorithms 
and,  to  a  lesser  extent,  accurate  input  information. 

The  subjects  in  the  utility  condition  generally  tended  to  report 
having  based  their  decisions  on  input  information  but  they  also  tended  to 
be  sensitive  to  differences  in  utilities  across  times,  suggesting  the  need 
for  some  measure  of  significance.  The  answers  and  suggestions  received 
from  the  no  uncertainty  condition  subjects  indicated  little  appreciation  for 
the  mission  uncertainty  indicators.  None  of  the  subjects  reported  the  need 
for  such  measures. 

There  were  also  some  questions  as  to  the  accuracy  of  the  expected 
utility  output  of  the  aid.  As  was  previously  stated,  the  possibility  that  the 
utilities  were  too  inclusive  was  mentioned.  It  was  also  said  that  the  per¬ 
ceived  value  of  the  expected  utility  might  be  enhanced  if  the  user  was  in¬ 
formed  about  what  units  were  included  in  calculating  the  utilities  as  well 
as  specifying  the  value  of  each  included  unit. 

The  reasonableness  of  any  specific  item  of  outcome  information  was 
not  questioned.  This  obtains  in  spite  of  the  stated  need  for  reasonably  ac¬ 
curate  algorithms.  This  result  may  have  been  related  to  the  subjects’  at¬ 
titude  when  asked  if  they  would  use  the  aid  under  live  combat  conditions. 

The  responses  were  generally  that  they  would  certainly  use  it  as  an  additional 
set  of  information  that  would  have  to  be  considered  when  making  a  strike 
timing  decision.  When  asked  for  the  reason,  the  usual  answer  was  that  the 
ASTDA  stands  head  and  shoulders  above  the  competition  (because  there  is 
nothing  to  compare  it  to). 


DISCUSSION AND SUMMARY


What  then  may  be  said  about  the  value  of  the  ASTDA  in  particular 
and  about  the  implications  of  the  present  work  for  decision  aids  in  general? 

The  results  of  the  present  work  certainly  seem  to  support  contentions 
favoring  the  value  of  the  ASTDA.  The  variance  analyses  consistently  indic¬ 
ated  statistically  significant  differences  between  the  aid  levels  investigated. 
And,  where  differences  were  found,  they  were  between  the  unaided  con¬ 
dition  and  some  level  of  aiding.  On  the  other  hand,  the  results  suggested 
that  although  some  level  of  aiding  helps,  more  is  not  necessarily  better. 

It  seems  that  the  aid's  input  displays  acted  as  an  information  organizational 
tool.  They  set  the  information  that  the  user  wanted  to  consider  into  per¬ 
spective  and  into  meaningful  relationships.  The  user  employed  only  parts 
of  the  information.  Similarly,  multiple  and  vast  arrays  of  output  seemed 
to  add  little.  Once  the  input  was  organized,  the  user  cared  little  about 
multiple  outputs  which  he  could  not  mentally  synthesize  into  an  integrated 
whole. Accordingly, he selected the output(s) of major meaningfulness
to  him  and  rested  with  it  (them).  While  the  number  of  input  and  output  dis¬ 
plays  which  the  user  can  manage  is  not  known,  the  number  is  certainly  few¬ 
er  than  the  number  provided  by  the  ASTDA.  Note  that  the  subjects  in  the 
evaluation  indirectly  voiced  this  same  thought  when  they  suggested,  during 
the  interview,  that  split  screen  displays  and  nomographs  for  relating  dis¬ 
played  information  would  be  helpful. 

One  limitation  of  the  aid  was  its  failure  to  accommodate  situational 
variables  which  may  bias,  and  possibly  override,  the  data  produced  by  the 
aid.  Specifically,  the  results  from  the  aid  did  not  square  with  the  con¬ 
clusions  of  our  panel  on  the  matter  of  preemptive  strikes.  The  aid  did  not 
consider  such  data  biasing  situations.  It  seems  that  such  overriding  vari¬ 
ables  or  contingency  conditions  should  be  taken  into  account  during  the  de¬ 
sign  of  any  decision  aid.  Otherwise,  the  aid  will  fail  to  provide  full  realism 
with  the  result  that  its  acceptability  may  suffer. 

The  interaction  effects  noted  by  the  various  analyses  are  reasonable 
but add complexity to the aid design problem. Output information was
significant for "hard" problems but not for "easy" ones. For "easy" problems
the  input  displays  were  most  meaningful.  Moreover,  there  was  some  evi¬ 
dence  that  the  output  displays  achieve  much  of  their  value  by  virtue  of  their 
ability  to  sensitize  the  user  to  the  input  information.  Seeing  the  projected 
outcome  forces  the  user  to  ask,  in  a  sense,  what  could  cause  that?  He  may 
then  reconsider  an  earlier  decision  or  generate  additional  hypotheses  for 
investigation  through  the  use  of  the  aid.  Both  the  multiattribute  utility  and 
the  expert  panel  data  suggested  that  such  nuances  were  more  readily  perceived 
by  the  experienced  than  by  the  inexperienced  subjects — as  might  have  been 
anticipated. 


Certainly,  the  ASTDA  has  achieved  its  goals,  as  defined  within 
the  multiattribute  utility  analysis,  to  a  considerable  extent.  The  multi- 
attribute  utility  of  the  aid  across  the  six  goals  considered  was  83  percent 
of  the  total  possible  utility.  Utility  relative  to  individual  goals  ranged 
from  79  percent  to  91  percent  of  the  total  possible.  These  values  seem 
quite  high--especially  since  the  ASTDA,  as  tested,  was  not  necessarily 
in  its  final  form. 
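
As a rough illustration of how an overall figure of this kind can be built
up from goal-level utilities, the Python sketch below uses a weighted
additive model. The additive form, the equal weights, and the six goal
scores are assumptions chosen only so that the numbers fall in the reported
range; they are not the weights or scores used in the analysis.

```python
def multiattribute_utility(goal_scores, goal_weights):
    """Weighted additive utility as a percentage of the maximum possible.

    goal_scores  -- utility achieved on each goal, in percent of possible
    goal_weights -- relative importance of each goal (any positive scale)
    """
    if len(goal_scores) != len(goal_weights):
        raise ValueError("one weight is required per goal")
    total_weight = sum(goal_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(s * w for s, w in zip(goal_scores, goal_weights))
    return weighted / total_weight


# Hypothetical scores for six goals, all between 79 and 91 percent.
hypothetical_scores = [79, 80, 81, 82, 85, 91]
equal_weights = [1] * len(hypothetical_scores)

overall = multiattribute_utility(hypothetical_scores, equal_weights)
print(f"overall utility: {overall:.1f} percent of the total possible")
```

With unequal weights the overall value shifts toward the scores of the
goals judged most important, which is why the weights deserve as much
scrutiny as the scores themselves.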

From  the  point  of  view  of  the  relative  merit  of  the  aid,  our  anal¬ 
ysis  indicated  an  increase  in  decision  validity  by  a  factor  of  five  when 
unaided  are  compared  with  aided  decisions. 

The  after  experiment  interview  indicated  a  number  of  areas  for 
attention  within  the  ASTDA  itself.  These  generally  included  uncluttering 
and  integrating  the  various  displays.  Moreover,  according  to  our  subjects, 
the  color  feature  and  the  color  graphics  added  little.  Some  cost  savings 
might  be  implemented  by  eliminating  these  factors.  Certainly,  the  oper¬ 
ational  acceptance  of  the  ASTDA  will  depend  on  the  availability  of  reliable 
input  information  and  on  the  faith  of  the  user  in  the  internal  algorithms. 

We  are  in  no  position  to  judge  either  the  reliability  of  the  input  information 
demanded  by  the  ASTDA  or  the  validity  of  its  algorithms.  However,  any 
ultimate,  fleet  user's  orientation  should  address  and  make  information 
available  about  these  issues. 

Nonetheless, the subjects found the ASTDA to provide useful
information  which  influenced  their  strike  timing  decisions.  The  experi¬ 
enced  Navy  flight  officer  subjects  said  that  they  believed  ASTDA  aided 
decisions  to  be  superior  to  nonaided  decisions  and  that  they  would  feel 
comfortable  using  the  ASTDA  during  actual  combat. 

Implications  for  Future  Evaluations 

The  present  work  also  provided  a  number  of  methodological  insights 
which  should  be  considered  in  any  evaluation  of  a  decision  aid  which  is  con¬ 
ducted  in  the  future. 

First, the criterion problem remains open. The present study
attempted to come to grips with the criterion problem, at least partially,
by employing two criteria. The use of multiple criteria has been advocated
in other fields (e.g., test development). But, such an approach does not
provide  an  answer  to  criterion  reliability  problems.  We  have  no  data  rel¬ 
ative  to  the  reliability  of  the  launch  time  judgments  of  our  criterion  panel 
and  the  question  of  whether  or  not  our  panel  would  provide  similar  results 
on  a  retest  remains  open.  Moreover,  the  panel  rankings  did  not  agree 
entirely  with  our  second  criterion,  the  utility  rankings  produced  by  the  aid. 
The  reasons  for  this  were  given  earlier.  However,  if  two  criteria  agree 
only moderately or disagree, how can one expect to obtain significant
validity for the aid against each of the criteria taken separately? Surely,
if  the  aid  agrees  with  one  criterion  it  will  disagree  with  the  other.  If  a 
mathematical  solution  to  a  problem  disagrees  with  the  best  judgment  of 
management,  which  course  of  action  is  to  be  preferred?  Management 
will  want  to  know  the  assumptions  of  the  mathematical  solution  and  methods 
in  such  a  case  and,  once  aware  of  these,  management  may  or  may  not  ac¬ 
cept  the  mathematical  solution.  Additionally,  in  the  present  case,  the 
total  situation  becomes  more  circular  because  the  mathematical  solution 
(utility)  was  itself  a  part  of  the  aid  and  was  available  to  the  subjects  in  three 
of  the  four  aided  conditions  (Exhibit  II).  Accordingly,  employing  this  cri¬ 
terion  presents  the  situation  of  assessing  the  aid  against  its  own  output. 

Yet, surprisingly, our data did not indicate a strong reliance by the
subjects on the expected utility output of the aid. Possibly they did not
understand or trust the utility construct. In sum, although we employed multiple
criteria  and  continue  to  advocate  such  an  approach  in  aid  evaluation,  such 
an  approach  does  not  compensate  for  criterion  weakness. 

Second,  any  aid  evaluation  will  depend  on  the  state  of  development  of 
an  aid  at  the  time  at  which  it  is  evaluated.  Evaluating  too  early  may  result 
in an injustice to an aid because the aid developers may not have had
sufficient opportunity to refine their design. Evaluating too late may allow
errors to go unrecognized until it is too late to do anything about them.
Accordingly, as suggested by Figure 1, aid evaluation may need to be viewed
against a continuum rather than as a process to be carried out at a specific
point in time. And, any evaluative results are pertinent only to the state of
the  aid  at  the  time  at  which  it  was  evaluated. 

When  one  is  involved  with  laboratory  experiments,  he  must  be  content 
with  intermediate  criteria.  Such  intermediate  criteria  are  more  often  than 
not based on matters of practicality rather than true relevance to the ultimate
criterion. Similarly, criterion sensitivity becomes an issue. Our failure
to  find  differences  between  some  of  the  levels  of  aiding  may  be  a  function  of 
lack  of  criterion  sensitivity  rather  than  any  failure  of  the  aid.  The  subjects 
may  have  worked  harder  in  one  condition  as  compared  with  another.  But, 
this  was  not  measured  by  the  criteria  employed. 

In  retrospect,  it  becomes  apparent  that  the  use  of  the  inexperienced 
group  as  subjects  may  not  have  been  warranted.  While  the  information/ data 
from  such  subjects  is  of  theoretic  interest,  such  theoretic  excursions  are 
costly.  After  all,  in  actual  practice,  one  can  anticipate  that  decisions,  such 
as  those  with  which  we  were  concerned  in  the  present  work,  will  be  made  by 
experienced  persons. 

The  interactive  and  the  moderator  effects  which  were  evident  in 
several  of  the  analyses  point  up  the  fact,  known  at  the  outset,  that  aid  de¬ 
velopment  and  aid  evaluation  are  not  easy  ways  to  pass  one's  time.  Evaluations 
must  be  carefully  designed  to  allow  for  the  identification  of  such  effects,  if 
present.  Barren  research  designs  will  miss  such  nuances.  And,  these  may 
be  more  important  than  "main"  effects. 


Finally,  we  note  that  a  number  of  our  analyses  were  based  on  cor¬ 
relational  methods.  Correlation  implies  strength  of  association--not 
causality.  Because  correlation  is  a  fundamental  tool  of  the  behavioral 
sciences,  a  number  of  techniques  have  been  developed  which  allow  one  to 
go  beyond  mere  statements  of  relationship  on  the  basis  of  correlation 
and  to  derive  statements  of  causality.  Most,  if  not  all,  of  these  methods 
are  based  on  structural  equation  models,  and  the  models  have  been  vari¬ 
ously  referred  to  as  simultaneous  equation  systems,  linear  causal  analysis, 
path  analysis,  structural  equation  models,  dependence  analysis,  cross- 
lagged  correlation,  and  the  like.  The  end  result  is  statements  of  cause  and 
effect  and  because  the  end  result  represents  a  causal  link  rather  than  a 
measure  of  association,  the  structural  results  do  not  coincide,  in  general, 
with  coefficients  of  regression  among  observed  variables. 
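
The last point, that a structural coefficient need not equal the regression
coefficient among observed variables, can be seen in a very small
constructed example. The Python sketch below is not LISREL and uses no data
from this study; the variable names and numbers are invented. It assumes a
common cause z that influences both x and y, so that the ordinary
regression slope of y on x overstates the direct (structural) effect of x.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


STRUCTURAL_B = 1.0  # assumed direct effect of x on y
CONFOUND_C = 2.0    # assumed effect of the common cause z on y

# z and e are constructed to be uncorrelated and centered on zero.
z = [-1.0, -1.0, 1.0, 1.0]   # common cause of x and y
e = [-1.0, 1.0, -1.0, 1.0]   # independent part of x
x = [zi + ei for zi, ei in zip(z, e)]
y = [STRUCTURAL_B * xi + CONFOUND_C * zi for xi, zi in zip(x, z)]

slope = ols_slope(x, y)
print(f"structural coefficient of x: {STRUCTURAL_B}")
print(f"regression slope of y on x:  {slope}")
```

Because z is left out of the regression, part of its influence on y is
attributed to x, and the slope differs from the structural value. A
structural equation model makes the assumed role of z explicit instead.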

Background  material  on  structural  equation  models  may  be  found 
in  Heise  (1975),  Duncan  (1975),  and  Goldberger  (1972).  Two  volumes  by 
Blalock  (1971,  1974)  contain  several  papers  dealing  with  basic  issues  and 
problems  at  an  elementary  level.  At  a  more  advanced  level,  two  recent 
volumes,  Goldberger  and  Duncan  (1973)  and  Aigner  and  Goldberger  (1977) 
cover  several  issues,  problems,  and  applications.  Bielby  and  Hauser  (1977) 
gave  an  excellent  review  of  the  sociological  literature  on  structural  equation 
models. 

One of the more recent and sophisticated techniques which relies
on linear structural equations was described by Joreskog and Sorbom
(1978): analysis of linear structural relationships by the method of
maximum likelihood (LISREL). The LISREL model was designed to handle
models with latent variables, measurement errors, and reciprocal causation.
The model seeks to establish whether or not a causal relationship exists
among a set of latent variables, some of which are designated as independent
variables while others are designated as dependent variables. The procedure
also possesses the important advantage that it requires measurement at
only one point in time, a distinct practical advantage. The structural
equation model specifies the causal relationship among the latent variables and
the amount of unexplained variance. Joreskog and Sorbom (1978) published
and  made  available  a  general  LISREL  computer  program  for  IBM  systems 
for  deriving  the  required  structural  equations.  It  would  seem  that  such 
causal  relationships  should  be  explored  in  future  evaluations  of  the  sort 
reported  here. 


Original  Hypotheses 

In Chapter I, five hypotheses were established to provide a basis for
the present evaluation. Each of these is now discussed. Hypothesis 1, more
effective strike time decisions can be made with the aid than without the aid,
seems to be quite strongly supported by the data. If effectiveness is defined
as how closely the subjects' decisions approximated the predictions of
the [illegible], ASTDA's effectiveness was certainly demonstrated by the
rather consistent differences observed between the full aid and the no
aid conditions. Arguments supporting the apparent effectiveness of the
aid  seem  to  be  further  supported  by  the  findings  that,  with  only  parts  of 
the  information  provided  by  the  aid,  the  quality  of  the  decisions  did  not  suf¬ 
fer.  The  lack  of  differential  effectiveness  across  the  aided  conditions  sug¬ 
gests  that  Hypothesis  5,  decision  effectiveness  will  vary  systematically  as 
the  characteristics  of  the  aid  are  varied,  cannot  be  supported  or  accepted. 
Whether  or  not  the  aid's  design  represents  a  case  of  overkill  is  difficult 
to say. Perhaps the obvious has been restated, i.e., the human can only
manage  a  limited  amount  of  information  at  one  time.  The  effectiveness  of 
the  ASTDA  is  probably  at  least  a  partial  function  of  its  sensitizing,  feed¬ 
back  mechanisms. 

The algorithmic logic of the ASTDA is not obvious to the user,
especially how nontransparent changes in particular variables might
influence outcomes. It might make sense to provide the user with this type
of information. This information might be particularly important if the
aid were accepted for use in the fleet and might be an integral part of the
training of: (1) personnel who use the aid, and (2) those who review any
decisions derived from the use of the aid. In addition, such information
would probably be useful in allowing a user to get a "feel" for the
significance of the various aspects of the aid.

Hypothesis 2, users will perceive the aid to possess value (utility),
was  also  supported  by  the  data  from  a  variety  of  sources.  In  terms  of 
overall  assessment,  both  the  data  from  the  perceived  utility  in  relation 
to  the  goals  and  the  data  from  the  interview  suggest  that  the  ASTDA  pos¬ 
sesses  qualified  value.  The  qualifications,  as  already  stated,  concern  the 
input  information  and  the  aid's  algorithms. 

Hypothesis  3,  effectiveness  and  perceived  value  will  not  vary  as  a 
function of the user's experience or problem difficulty, cannot be
unequivocally accepted. The effectiveness of the decisions did not vary with
experience but did vary with difficulty. However, the first order
interaction indicated that most of the effects of difficulty were produced in the
no aid condition. Perceived value of the aid, depending on the measure,
fluctuated only in minor ways with experience. This was certainly true of
the perceived utility in relation to the ASTDA goals. However, the subjects'
assessments  during  the  interview  suggest  that  perceived  utility  may  increase 
as  a  function  of  difficulty  or  divergence  of  the  information. 

Support for Hypothesis 4, the strike timing decision aid possesses
criterion related validity, varies in accordance with the set of assumptions
one accepts. Using the expert panel's judgments as the validity criterion
and assuming that the aid should rank order alternate strike times in a
manner congruent with those of the criterion suggests moderate criterion
related validity for the aid. If it is assumed that the ASTDA's validity
should be assessed only within the limited area that it addresses, then
the agreement level increases. That is, the validity rises if the possibility
of preemptive strikes is not considered. If it is further assumed that the
only important decision facing a task force flight operations officer is the
one time to launch an air strike rather than a ranking of times, then the
apparent validity based on agreement rises considerably.
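
The difference between agreement on a full ranking and agreement on the
single best launch time can be made concrete with a short Python sketch.
The rankings below are hypothetical, and the choice of Spearman's rank
correlation as the agreement index is an assumption made for illustration;
it is not necessarily the statistic used in the analysis.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have equal length of at least 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def same_top_choice(rank_a, rank_b):
    """True when both rankings place the same item first."""
    return rank_a.index(1) == rank_b.index(1)


# Hypothetical ranks assigned to five candidate launch times (1 = best).
panel_ranks = [1, 2, 3, 4, 5]
aid_ranks = [1, 3, 2, 5, 4]

rho = spearman_rho(panel_ranks, aid_ranks)
top_agrees = same_top_choice(panel_ranks, aid_ranks)
print(f"rank agreement (rho): {rho:.2f}")
print(f"same best launch time: {top_agrees}")
```

In this invented case the two rankings differ in their lower positions, so
the rank correlation is less than perfect, yet both sources pick the same
best launch time. Apparent validity therefore depends on which of the two
questions is being asked.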
ASTDA POST EXPERIMENTAL DEBRIEFING FORM


Subject  Name _ _ _ 

Subject  ID  Number _ 

Treatment  Condition 


Interviewer

As the final part of the experiment, I have around 20 questions to ask you. These
questions concern your opinions about the various features of ASTDA and their
value. Okay.


1.  Please tell me, in general terms, about how useful, if at all useful, the
ASTDA was to you for coming to a direct strike timing decision?


2.  How did the ASTDA help you most for deriving a strike timing decision?
How so? Least? How so?

Most help: ______________________


How so: ______________________


Least help: ______________________


How so: ______________________


3.  What aspects of the ASTDA were most confusing to you? How so?
Most confusing aspects: ______________________


4.  Would you say that ASTDA was (a) more useful in deriving a strike timing
decision for some problems than for others or (b) was ASTDA equally useful
for all problems? How so?

____ (a)  ____ (b)

How so: ______________________


5.  With this usefulness rating scale (show card with scale) rate the usefulness
of the information for deriving your strike time decisions. What is your
rating for the information on:

                                                     Rating

Readiness and Weather Report Time ..............  1  2  3  4  5

Air Strike Mission Structure ...................  1  2  3  4  5

Weather at Target ..............................  1  2  3  4  5

Weather at Carrier .............................  1  2  3  4  5

Blue Force Readiness ...........................  1  2  3  4  5

Desired Number of Blue Aircraft ................  1  2  3  4  5

Orange Air Defense .............................  1  2  3  4  5

Orange Ground Force ............................  1  2  3  4  5


6.  (For treatment conditions 1, 2, and 4 only)

How much influence did the utility outcome values have on your strike timing
decisions? Would you say:

                                  Rating

no influence at all ..............  1

very little influence ............  2

some influence ...................  3

much influence ...................  4

very much influence ..............  5


7.  (For treatment conditions 1, 3, and 4 only)

With this rating scale (show card with usefulness scale), rate the usefulness
of the following combat loss information:


1.  blue force air losses versus strike time .....  1  2  3  4  5

2.  orange air losses versus strike time .........  1  2  3  4  5

3.  orange ground losses versus strike time ......  1  2  3  4  5


8.  (For treatment conditions 1, 3, and 4 only)

Using this scale (show card with influence scale), rate how much influence
each combat loss information aspect had on your strike timing decisions:

                                                       Rating

1.  blue force air losses versus strike time .....  1  2  3  4  5

2.  orange air losses versus strike time .........  1  2  3  4  5

3.  orange ground losses versus strike time ......  1  2  3  4  5

9.  Tell me the order of usefulness--from most to least useful--of each of these
(show cards with types of information) in helping you decide on the time to
launch an air strike. Which was first most useful, which was second, and
so on until you have ordered the information from first to last?

(For treatment condition(s) only)

ALL                   Readiness and Weather Report Time

ALL                   Air Strike Mission Structure

ALL                   Weather at Target

ALL                   Weather at Carrier

ALL                   Blue Force Readiness

ALL                   Desired Number of Blue Aircraft

ALL                   Orange Air Defense

ALL                   Orange Ground Force

(1, 2, and 4 only)    Utility Outcome Values
(1, 3, and 4 only)    Blue Force Air Losses
(1, 3, and 4 only)    Orange Air Losses Versus Strike Time
(1, 3, and 4 only)    Orange Ground Losses Versus Strike Time

10.  Was the tabular or graphic information more useful, or were both the
tables and graphs equally useful? Why do you think so?

1.  tables more useful

2.  graphs more useful

3.  both equally useful


11.  (For treatment conditions 1, 2, and 3 only)

Were the averages or the ranges provided on the tables more useful, or
were the averages and ranges both equally useful? How so?

1.  averages more useful
2.  ranges more useful
3.  both equally useful

How so? ______________________


12.  Using the usefulness scale (show card with scale), rate how useful, if at
all, the coloring of the graphs was.

Rating

1  2  3  4  5

Why did you rate this so? ______________________


13.  (For treatment conditions 1, 2, and 3 only)

Using the usefulness scale (show card with scale), rate the usefulness of the
delta-biased uncertainty bands.

Rating

1  2  3  4  5

Why did you rate this so? ______________________


14.  What, if anything, was confusing about the tables? Which tables were confusing?
How so?


20.  (For experienced group only)

Would you feel "comfortable" in using the ASTDA or some variant of it under
live and real combat conditions? Do you think other air operations officers
would? Why?


You:

Others:


Why?


21.  (For experienced group only)

Would you be surprised if the actual blue and orange combat losses during a
strike mission were far worse or far better than the losses predicted by the
ASTDA? Why do you think so?

____ Yes  ____ No

Why? ______________________


22.  Do you have any other comments?

____ Yes  ____ No

Why?


THANK  YOU  FOR  YOUR  COOPERATION 




You have just read the goals of ASTDA. Now, I want you to indicate
below how closely each goal was achieved by the decision aid. For each goal
assign a 0% to 100% rating: 0% means the goal was not at all attained, and
100% means the goal was completely achieved. Use whatever % you feel
expresses ASTDA's achievement of the goal.


Goal  1 




Appendix  B 

Zero  Order  Correlation  Matrices  from  Which  the  Multiple 
Correlation  Coefficients  and  Regression  Equations  Were  Computed 